Data management in R

Gerko Vink

Methodology & Statistics @ Utrecht University

2 Jun 2025

Disclaimer

I owe a debt of gratitude to many people, as the thoughts and code in these slides are the product of years-long development cycles and discussions with my team, friends, colleagues and peers. When someone has contributed to the content of the slides, I have credited their authorship.

Images are either directly linked, or generated with StableDiffusion or DALL-E. That said, there is no information in this presentation that exceeds legal use of copyright materials in academic settings, or that should not be part of the public domain.

Warning

You may use any and all content in this presentation - including my name - and submit it as input to generative AI tools, with the following exception:

  • You must ensure that the content is not used for further training of the model

Slide materials and source code

Materials

Recap

Yesterday we learned

  1. How to use R and RStudio
  2. How to install packages
  3. How to use simple data containers
  4. How to do subsetting in base R with [ ] and $
  5. How to use logical operators to subset data
  6. How to adhere to code conventions and style

Today

  • Importing and inspecting datasets
  • Understanding and applying different data types and database formats
  • Labelling and (re)coding variables
  • The blueprint of R: frames and environments
  • Pipes
  • Using formulas in functions

New packages we use

library(tibble)   # tibbles: a variation on data frames
library(dplyr)    # data manipulation
library(haven)    # in/exporting data
library(magrittr) # pipes
library(labelled) # labelled data manipulation
library(tidyr)    # data tidying
library(broom)    # tidying model outputs

Importing Data: Stata

stata_data <- read_dta("files/03-poverty-analysis-data-2022-rt001-housing-plus.dta")
head(stata_data)
# A tibble: 6 × 114
        hhid domain2     psu domain  gp_subdom district fortnight panel   hhid16
       <dbl> <dbl+lbl> <dbl> <dbl+l> <dbl+lbl> <chr>        <dbl> <dbl+l>  <dbl>
1 1102500401 1.1       10250 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.02e6
2 1102500501 1.1       10250 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.02e6
3 1102500502 1.1       10250 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.02e6
4 1102501202 1.1       10250 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.02e6
5 1107430501 1.1       10743 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.04e6
6 1107430801 1.1       10743 1 [Gre… 1 [Param… Paramar…         1 1 [Pan… 6.04e6
# ℹ 105 more variables: lat_cen <dbl>, long_cen <dbl>, result <dbl+lbl>,
#   end_date_n <date>, Year_s <dbl>, Month_s <dbl>, Day <dbl>, stratum <dbl>,
#   hhid_text <chr>, HHsize <dbl>, HHsize2 <dbl>, interv <dbl>, end_date <chr>,
#   q17_02 <dbl+lbl>, q17_03a <dbl+lbl>, q17_03b <dbl+lbl>, q17_04 <dbl+lbl>,
#   q12a <dbl+lbl>, q12_01a <dbl>, q12_01b <dbl>, q12_02a <dbl>, q12_02b <dbl>,
#   q12_03a <dbl>, q12_03b <dbl>, q12_04a <dbl>, q12_04b <dbl>,
#   q12_05 <dbl+lbl>, q13_01 <dbl+lbl>, q13_01_ot <chr>, q13_02 <dbl+lbl>, …

Importing Data: SPSS

spss_data <- read_sav("files/SUR_2023_LAPOP_AmericasBarometer_v1.0_w_orginal.sav",
                      user_na = TRUE)
spss_data2 <- read_sav("files/SUR_2023_LAPOP_AmericasBarometer_v1.0_w_orginal.sav")
head(spss_data)
# A tibble: 6 × 162
  idnum     pais          nationality     estratopri          estratosec  strata
  <dbl+lbl> <dbl+lbl>     <dbl+lbl>       <dbl+lbl>           <dbl+lbl>   <dbl+>
1 5581      27 [Suriname] 27 [Surinamese] 2702 [Wanica / Par… 2 [Medium … 2702  
2 5642      27 [Suriname] 27 [Surinamese] 2701 [Paramaribo]   1 [Large (… 2701  
3 4622      27 [Suriname] 27 [Surinamese] 2701 [Paramaribo]   1 [Large (… 2701  
4 4034      27 [Suriname] 27 [Surinamese] 2702 [Wanica / Par… 2 [Medium … 2702  
5 9206      27 [Suriname] 27 [Surinamese] 2701 [Paramaribo]   1 [Large (… 2701  
6 2101      27 [Suriname] 27 [Surinamese] 2702 [Wanica / Par… 2 [Medium … 2702  
# ℹ 156 more variables: prov <dbl+lbl>, municipio <dbl+lbl>, upm <dbl+lbl>,
#   ur <dbl+lbl>, cluster <dbl+lbl>, year <dbl+lbl>, wave <dbl+lbl>,
#   wt <dbl+lbl>, q1tc_r <dbl+lbl>, q2 <dbl+lbl>, a4n <dbl+lbl>,
#   soct2 <dbl+lbl>, idio2 <dbl+lbl>, mesfut1 <dbl+lbl>, cp8 <dbl+lbl>,
#   it1 <dbl+lbl>, jc10 <dbl+lbl>, jc13 <dbl+lbl>, jc15a <dbl+lbl>,
#   jc16a <dbl+lbl>, vic1ext <dbl+lbl>, aoj11 <dbl+lbl>, aoj12 <dbl+lbl>,
#   pese1 <dbl+lbl>, pese2 <dbl+lbl>, aoj17 <dbl+lbl>, ivol24 <dbl+lbl>, …

Quick inspection: missingness

sum(is.na(stata_data)) # total number of NAs
[1] 31355
sum(is.na(spss_data)) # total number of NAs
[1] 28174
sum(is.na(spss_data2)) # total number of NAs
[1] 28174
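
A single total says little about where the missing values sit. As a minimal sketch on a made-up data frame (`toy` and its columns are hypothetical, not part of the survey files), the same `is.na()` call can be summarised per column in base R:

```r
# Hypothetical toy data; the column names are invented for illustration
toy <- data.frame(
  age    = c(59, NA, 30, 36),
  gender = c(NA, 2, 1, NA),
  id     = 1:4
)

total_na   <- sum(is.na(toy))       # total number of NAs in the data frame
na_per_col <- colSums(is.na(toy))   # number of NAs per column
na_share   <- colMeans(is.na(toy))  # proportion of NAs per column

total_na
na_per_col
na_share
```

The same two calls should work on `stata_data` and `spss_data`, since `is.na()` returns a logical matrix for a data frame.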

Let’s look at gender

head(spss_data$q1tc_r, n = 20)
<labelled_spss<double>[20]>: Gender
 [1] 888888      2      1      2      1      2      2      1      1      2
[11]      1      2      1      2      1      2      1      1      1      2
Missing values: 888888, 988888, 999999

Labels:
  value                                    label
      1                                 Man/male
      2                             Woman/female
      3 Does not identify as either man or woman
 888888                                       DK
 988888                                       NR
 999999                                      N/A
head(spss_data2$q1tc_r, n = 20)
<labelled<double>[20]>: Gender
 [1] NA  2  1  2  1  2  2  1  1  2  1  2  1  2  1  2  1  1  1  2

Labels:
  value                                    label
      1                                 Man/male
      2                             Woman/female
      3 Does not identify as either man or woman
 888888                                       DK
 988888                                       NR
 999999                                      N/A

Let’s look at gender

table(spss_data$q1tc_r)

     1      2      3 888888 988888 
   743    746      1     13     36 
table(spss_data2$q1tc_r)

  1   2   3 
743 746   1 

Correcting the missings

In base R, there is only one type of missing value: NA. SPSS and Stata allow several types of missing values, such as the user-defined codes 888888 (DK), 988888 (NR) and 999999 (N/A) in the gender variable. The haven package can convert all of these to a regular NA.

spss_data <- zap_missing(spss_data) # set all special missing values to NA
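
To see what this does on a small scale, here is a sketch with a hand-made vector that mimics `q1tc_r`. The object names `gender` and `zapped` are hypothetical; `labelled_spss()` and `zap_missing()` are haven functions.

```r
library(haven)

# Hypothetical toy vector: 888888 is declared as a user-defined missing value
gender <- labelled_spss(
  c(888888, 2, 1, 2, 1),
  labels    = c("Man/male" = 1, "Woman/female" = 2, "DK" = 888888),
  na_values = 888888
)

# is.na() already treats the user-defined code as missing ...
is.na(gender)

# ... but the code itself is only replaced by NA after zap_missing()
zapped <- zap_missing(gender)
zapped
sum(is.na(zapped))
```

This would also explain why `sum(is.na())` gave the same count for `spss_data` and `spss_data2` earlier, while `table()` still showed the special codes for `spss_data`.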


Let’s look at gender again

table(spss_data$q1tc_r)

  1   2   3 
743 746   1 
table(spss_data2$q1tc_r)

  1   2   3 
743 746   1 

Tagged missing values

Alternatively, if you’d still like to use special NAs, you can use haven::tagged_na()

x <- c(1:5, tagged_na("a"), tagged_na("z"), NA)

# Tagged NA's work identically to regular NAs
x
[1]  1  2  3  4  5 NA NA NA
is.na(x)
[1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
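
The tags are invisible in ordinary printing. A short sketch of how to get them back, using haven's `is_tagged_na()`, `na_tag()` and `print_tagged_na()` on the same vector `x`:

```r
library(haven)

x <- c(1:5, tagged_na("a"), tagged_na("z"), NA)

# Which elements carry a tag? The plain NA at the end does not.
tagged <- is_tagged_na(x)
tagged

# Extract the tag itself (NA for untagged elements)
tags <- na_tag(x)
tags

# Print the vector with the tags shown
print_tagged_na(x)
```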

Exploring data sets

glimpse(spss_data)
Rows: 1,539
Columns: 162
$ idnum       <dbl+lbl> 5581, 5642, 4622, 4034, 9206, 2101, 3574,  709, 8666, …
$ pais        <dbl+lbl> 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27…
$ nationality <dbl+lbl> 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27…
$ estratopri  <dbl+lbl> 2702, 2701, 2701, 2702, 2701, 2702, 2701, 2701, 2701, …
$ estratosec  <dbl+lbl> 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, …
$ strata      <dbl+lbl> 2702, 2701, 2701, 2702, 2701, 2702, 2701, 2701, 2701, …
$ prov        <dbl+lbl> 2702, 2701, 2701, 2702, 2701, 2702, 2701, 2701, 2701, …
$ municipio   <dbl+lbl> 270214, 270109, 270109, 270214, 270109, 270214, 270109…
$ upm         <dbl+lbl> 43, 21, 21, 43, 21, 43, 21, 21, 21, 21, 21, 21, 43, 21…
$ ur          <dbl+lbl> 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, …
$ cluster     <dbl+lbl>  87,  67,   5, 234,  67, 234, 233, 233,   5,  92,  67,…
$ year        <dbl+lbl> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, …
$ wave        <dbl+lbl> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, …
$ wt          <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ q1tc_r      <dbl+lbl> NA,  2,  1,  2,  1,  2,  2,  1,  1,  2,  1,  2,  1,  2…
$ q2          <dbl+lbl> 59, 61, 30, 36, 34, 53, 75, 27, 27, 50, 60, 34, 50, 30…
$ a4n         <dbl+lbl>  1,  1,  1,  1,  1,  1,  1,  1, 77,  1,  1,  1,  1, 77…
$ soct2       <dbl+lbl> 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, …
$ idio2       <dbl+lbl>  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  3,  2…
$ mesfut1     <dbl+lbl>  4,  2,  3,  3,  2,  4,  3, NA,  4,  4,  3,  3,  4,  2…
$ cp8         <dbl+lbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
$ it1         <dbl+lbl>  2,  1,  3,  1,  2,  3,  2,  2,  3, NA,  1,  1,  2,  2…
$ jc10        <dbl+lbl>  1,  2,  1, NA,  1,  2, NA, NA, NA, NA,  2,  2, NA, NA…
$ jc13        <dbl+lbl> NA, NA, NA,  1, NA, NA,  2,  2,  2,  2, NA, NA,  2,  2…
$ jc15a       <dbl+lbl>  2, NA, NA,  2, NA, NA,  2, NA,  2,  1, NA,  2, NA,  2…
$ jc16a       <dbl+lbl> NA,  2,  2, NA,  1,  2, NA,  2, NA, NA,  2, NA,  1, NA…
$ vic1ext     <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ aoj11       <dbl+lbl> 2, 2, 4, 3, 2, 4, 1, 3, 2, 2, 2, 2, 2, 2, 3, 4, 4, 1, …
$ aoj12       <dbl+lbl> NA,  1,  1,  1,  2,  4,  3,  4,  2,  2,  2,  4,  2, NA…
$ pese1       <dbl+lbl>  3,  3,  1,  3,  3,  1,  3,  1,  2,  3,  3,  3,  2,  3…
$ pese2       <dbl+lbl>  2,  1,  1,  3,  3,  1,  2,  1,  2,  3,  2,  3,  2, NA…
$ aoj17       <dbl+lbl>  4,  4,  1,  4,  2,  4,  4,  3,  3,  4,  4,  4,  2, NA…
$ ivol24      <dbl+lbl>  0, NA,  0,  0,  0,  0,  0,  1, NA, NA,  0,  1,  0,  0…
$ gang21      <dbl+lbl> NA, NA,  2, NA,  1, NA, NA, NA,  2, NA, NA, NA,  2, NA…
$ gang22      <dbl+lbl> NA, NA,  5, NA,  1, NA, NA,  2,  2, NA, NA, NA,  2, NA…
$ gang23      <dbl+lbl> NA, NA,  2, NA,  1, NA, NA,  2, NA, NA, NA, NA,  2, NA…
$ gang24      <dbl+lbl> NA, NA,  1, NA,  2, NA, NA, NA,  1, NA, NA, NA,  2, NA…
$ aoj30       <dbl+lbl> 2, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, …
$ diso8       <dbl+lbl>  5, NA,  1,  5,  5,  5,  5,  3,  1, NA,  5,  3,  2,  5…
$ drugt1      <dbl+lbl>  1,  1,  1,  1,  1,  1,  1,  2, NA,  1,  1,  1,  1,  1…
$ countfair1  <dbl+lbl> NA,  1,  1,  3,  2,  1,  2,  3,  2,  2,  1,  2,  2,  1…
$ countfair3  <dbl+lbl>  3,  3,  1,  2,  2,  1,  2,  1,  2,  2,  3,  2,  2,  3…
$ chm1bn      <dbl+lbl> NA, NA,  2, NA, NA, NA,  2, NA,  1,  1, NA, NA, NA, NA…
$ chm2bn      <dbl+lbl> NA, NA, NA,  1,  2, NA, NA, NA, NA, NA,  2,  2,  1,  2…
$ b0          <dbl+lbl> 1, 4, 7, 1, 4, 4, 6, 7, 3, 7, 6, 4, 5, 3, 6, 1, 7, 2, …
$ b1          <dbl+lbl> 7, 5, 4, 4, 5, 7, 4, 3, 4, 5, 4, 4, 5, 6, 5, 4, 3, 6, …
$ b2          <dbl+lbl> 6, 4, 2, 4, 4, 2, 4, 5, 4, 6, 3, 3, 4, 1, 6, 5, 1, 4, …
$ b3          <dbl+lbl> 2, 3, 1, 4, 5, 2, 2, 2, 5, 3, 4, 2, 3, 1, 5, 3, 1, 4, …
$ b4          <dbl+lbl> 5, 5, 1, 4, 5, 7, 3, 1, 5, 2, 1, 5, 4, 3, 2, 1, 2, 4, …
$ b6          <dbl+lbl>  4,  3,  7,  4,  7,  7,  7,  2,  5,  4,  1,  7,  3, NA…
$ b12         <dbl+lbl>  3,  6,  1,  5,  4,  1,  5,  1,  5,  1,  6,  4,  2, NA…
$ b13         <dbl+lbl> 2, 6, 1, 1, 5, 1, 4, 3, 3, 3, 4, 3, 5, 1, 6, 2, 1, 4, …
$ b18         <dbl+lbl> 1, 5, 1, 5, 2, 1, 3, 2, 3, 1, 3, 3, 6, 6, 5, 5, 1, 6, …
$ b21         <dbl+lbl> 7, 3, 1, 1, 4, 1, 4, 1, 4, 2, 2, 1, 4, 1, 5, 5, 1, 3, …
$ b21a        <dbl+lbl> 1, 5, 1, 1, 4, 1, 1, 3, 1, 1, 1, 1, 2, 1, 5, 1, 1, 3, …
$ b21f        <dbl+lbl> 2, 5, 1, 1, 4, 1, 5, 2, 2, 2, 1, 1, 2, 3, 5, 2, 1, 2, …
$ b31         <dbl+lbl> 3, 6, 1, 5, 3, 1, 4, 2, 2, 1, 5, 3, 3, 6, 5, 5, 3, 6, …
$ b37         <dbl+lbl> 5, 6, 1, 5, 5, 1, 5, 3, 3, 5, 4, 4, 4, 4, 6, 5, 4, 5, …
$ b47a        <dbl+lbl> 1, 5, 1, 5, 3, 1, 5, 3, 5, 4, 6, 4, 6, 7, 6, 4, 4, 6, …
$ m1          <dbl+lbl>  4,  3,  5,  5,  3,  5,  4,  4,  4,  2,  5,  5,  4, NA…
$ pop101      <dbl+lbl>  2,  4,  1,  4,  5,  1,  2,  1,  7,  6,  1,  4,  3, NA…
$ pop107      <dbl+lbl> 1, 3, 7, 7, 6, 1, 2, 1, 6, 7, 1, 2, 1, 1, 4, 5, 4, 4, …
$ ros4        <dbl+lbl> 7, 7, 5, 7, 6, 1, 7, 1, 5, 5, 7, 6, 7, 7, 6, 4, 7, 6, …
$ ing4        <dbl+lbl> 1, 7, 2, 4, 4, 1, 6, 1, 5, 3, 5, 4, 3, 5, 5, 4, 4, 6, …
$ eff1        <dbl+lbl> 3, 5, 1, 1, 3, 1, 3, 1, 2, 6, 1, 1, 1, 1, 3, 1, 1, 6, …
$ eff2        <dbl+lbl> 2, 5, 6, 4, 5, 1, 4, 1, 5, 5, 3, 4, 7, 5, 3, 1, 3, 5, …
$ dra1        <dbl+lbl> 5, 6, 7, 4, 5, 1, 1, 7, 7, 4, 2, 5, 7, 6, 5, 7, 4, 7, …
$ dra2        <dbl+lbl> 7, 6, 7, 4, 5, 1, 1, 7, 4, 6, 3, 3, 7, 7, 5, 7, 6, 6, …
$ vb21n       <dbl+lbl>  6,  6,  1,  6,  1,  6,  5,  3,  6,  6,  1,  5,  1,  1…
$ crg1        <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, …
$ crg2        <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, …
$ env2b       <dbl+lbl> 1, 1, 1, 1, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 1, 1, …
$ anestg      <dbl+lbl> 2, 1, 4, 4, 2, 4, 3, 4, 2, 2, 4, 4, 4, 3, 2, 1, 4, 2, …
$ pn4         <dbl+lbl>  3,  2,  3,  3,  3,  3,  1,  2,  2,  3,  3,  3,  2,  3…
$ dem30       <dbl+lbl>  1,  1,  1,  1,  1,  2,  1,  1,  2,  1,  1,  1,  1,  1…
$ e5          <dbl+lbl>  1,  8, 10,  1,  6,  1,  8, 10,  1,  9, 10,  7, 10, 10…
$ e20sur      <dbl+lbl>  2,  1,  8, 10,  7,  1,  1, 10,  5,  7,  1,  4,  1,  1…
$ e17a        <dbl+lbl>  3, NA, 10, 10,  9, NA,  9, 10,  5, NA, NA, NA, NA, 10…
$ e17b        <dbl+lbl> NA,  8, NA, NA, NA,  1, NA, NA, NA,  6, 10,  1,  7, NA…
$ d3          <dbl+lbl>  5,  4,  6,  1,  6,  1,  3, 10,  1,  5, 10,  3,  1,  1…
$ d4          <dbl+lbl>  7,  7,  1,  4,  5,  1,  2, 10,  5,  9, 10,  4,  5,  1…
$ d6          <dbl+lbl>  1,  5,  1,  1,  6,  1,  1, 10, 10,  7, 10,  1,  9,  1…
$ d5newa      <dbl+lbl> NA,  6,  1, NA, NA,  1, NA, 10, 10,  3, 10,  1, NA,  1…
$ d5newb      <dbl+lbl>  7, NA, NA,  5,  5, NA,  1, NA, NA, NA, NA, NA,  8, NA…
$ exc2        <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ exc6        <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ exc7new     <dbl+lbl> NA,  2,  5,  2,  4,  1,  4,  5,  2, NA,  2,  4,  5,  2…
$ aoj2cora    <dbl+lbl>  2,  1,  2,  2,  2,  2,  1,  2,  2,  1,  2,  1,  2, NA…
$ aoj2corb    <dbl+lbl> NA,  3, NA, NA, NA, NA,  1, NA, NA,  2, NA,  1, NA, NA…
$ lib2c       <dbl+lbl> 2, 1, 1, 1, 1, 1, 3, 1, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1, …
$ vb2         <dbl+lbl> 1, 2, 2, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 1, 2, 1, 1, 2, …
$ vb3n        <dbl+lbl> 2702,   NA,   NA, 2701, 2702, 2701,   NA, 2703,   NA, …
$ vb10        <dbl+lbl> 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, …
$ vb11        <dbl+lbl>   NA,   NA,   NA,   NA, 2702,   NA,   NA,   NA,   NA, …
$ pol1        <dbl+lbl> 4, 2, 4, 1, 3, 4, 2, 4, 2, 2, 4, 4, 2, 1, 2, 4, 3, 2, …
$ vb20        <dbl+lbl>  3,  1,  1,  3,  1,  1,  1, NA,  1,  4,  3,  4,  1, NA…
$ vb50        <dbl+lbl>  4,  3,  4,  3,  3,  3,  2,  3,  3,  3,  1,  3,  3,  4…
$ vb51        <dbl+lbl>  3, NA, NA,  3,  3,  3,  3, NA,  3,  3,  3, NA,  3, NA…
$ vb52        <dbl+lbl> NA,  2,  3, NA, NA, NA, NA,  3, NA, NA, NA,  2, NA,  3…
$ vb58        <dbl+lbl> NA, NA, NA,  3,  3,  3,  2, NA,  3, NA,  1, NA,  2,  3…
$ vb58exp     <dbl+lbl>  3,  3,  3, NA, NA, NA, NA,  1, NA,  3, NA,  3, NA, NA…
$ dvw1        <dbl+lbl> 2, 3, 2, 2, 2, 2, 3, 2, 2, 3, 3, 3, 2, 2, 3, 2, 2, 2, …
$ dvw2        <dbl+lbl> 3, 3, 2, 2, 2, 2, 3, 3, 2, 1, 3, 2, 2, 2, 2, 3, 3, 2, …
$ for5n       <dbl+lbl>  3,  4,  4,  4,  4,  4, 13, 13,  4, 13,  5,  1,  4,  4…
$ mil10a      <dbl+lbl>  2,  3,  2,  2,  2,  2, NA,  4, NA,  4, NA,  1,  4, NA…
$ mil10b      <dbl+lbl>  4,  3, NA, NA,  3,  3, NA,  4,  2,  4,  3,  2,  4, NA…
$ mil10e      <dbl+lbl>  2,  2,  2,  2,  2,  1, NA, NA,  2,  4,  2,  4,  1, NA…
$ dis11       <dbl+lbl> 4, 4, 1, 4, 4, 4, 4, 2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4, …
$ dis12       <dbl+lbl> 4, 4, 1, 4, 4, 4, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
$ mgm1        <dbl+lbl> 1, 1, 1, 1, 2, 4, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ mgm2        <dbl+lbl> 5, 5, 2, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, …
$ mgm3        <dbl+lbl> NA,  3,  1,  3,  1, NA,  1, NA,  1,  1,  1,  1,  1,  1…
$ wf1         <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ edre        <dbl+lbl> 1, 4, 3, 3, 3, 3, 4, 5, 3, 3, 6, 4, 4, 3, 3, 3, 3, 6, …
$ q3cn        <dbl+lbl> 2702,   NA,    4, 2701,    1, 2702,    5,    1,    1, …
$ q5b         <dbl+lbl> 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, …
$ ocup4a      <dbl+lbl> 7, 5, 1, 3, 1, 1, 6, 1, 1, 1, 1, 1, 1, 1, 4, 1, 1, 6, …
$ formal      <dbl+lbl> NA, NA,  2, NA,  1,  1, NA,  1,  1,  1,  2,  1,  1,  1…
$ q10inc      <dbl+lbl> 2703, 2706, 2705, 2706, 2710, 2704, 2711, 2711, 2706, …
$ q10e        <dbl+lbl>  3,  2,  2,  2,  2,  2,  3,  1,  1,  1,  3,  1,  3,  2…
$ q14         <dbl+lbl> 2, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, …
$ fs2         <dbl+lbl> 0, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, …
$ fs212       <dbl+lbl>  0,  0,  0, NA,  1,  0,  0, NA, NA, NA,  0,  0, NA,  0…
$ fs8         <dbl+lbl> 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, …
$ ws1         <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, …
$ ws2         <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA,  4, NA, NA, NA, NA…
$ q11n        <dbl+lbl> 2, 2, 1, 3, 1, 2, 2, 3, 3, 3, 2, 1, 2, 2, 1, 1, 2, 1, …
$ q12cn       <dbl+lbl> 3, 2, 9, 6, 5, 6, 2, 2, 4, 2, 4, 6, 4, 4, 8, 6, 4, 1, …
$ q12bn       <dbl+lbl> 0, 0, 2, 2, 1, 0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 0, 1, 0, …
$ q12bnf      <dbl+lbl> NA, NA,  3,  1,  1, NA, NA, NA, NA, NA, NA,  1,  1,  1…
$ q12p        <dbl+lbl> NA, 22, NA, 20, NA, 23, 29, NA, NA, 17, NA, 19, NA, 26…
$ etid        <dbl+lbl> 2709,    5,    3, 2709,    3, 2709,    3,    5,    3, …
$ leng1       <dbl+lbl> 2703, 2701, 2701, 2703, 2701, 2703, 2701, 2701, 2701, …
$ gi0n        <dbl+lbl> 3, 1, 1, 1, 2, 3, 1, 2, 2, 1, 1, 2, 1, 2, 3, 2, 1, 1, …
$ smedia1n    <dbl+lbl> 2, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2, …
$ smedia3n    <dbl+lbl> NA,  2,  1, NA,  1, NA,  1,  1,  1,  1,  5,  2,  1,  1…
$ smedia3b    <dbl+lbl> NA, NA,  2, NA,  2, NA,  1,  2,  1,  2, NA, NA,  3,  1…
$ smedia11    <dbl+lbl> 2, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 2, 2, 2, 1, 1, 2, …
$ smedia12    <dbl+lbl> 2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 1, …
$ smedia13    <dbl+lbl> 4, 1, 2, 4, 2, 4, 4, 1, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1, …
$ smedia14n   <dbl+lbl> 4, 2, 1, 2, 2, 4, 4, 1, 1, 2, 2, 2, 1, 2, 1, 1, 1, 1, …
$ smedia15    <dbl+lbl> 4, 3, 1, 4, 3, 4, 4, 1, 4, 2, 4, 4, 4, 4, 2, 1, 2, 1, …
$ smedia16    <dbl+lbl> 1, 3, 4, 1, 3, 1, 1, 3, 7, 2, 1, 1, 1, 4, 3, 3, 1, 1, …
$ r3          <dbl+lbl> 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, …
$ r4a         <dbl+lbl> 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ r6          <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ r7          <dbl+lbl> 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, …
$ r12         <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ r15         <dbl+lbl> 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, …
$ r18n        <dbl+lbl> 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, …
$ r18         <dbl+lbl> 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, …
$ r16         <dbl+lbl> 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ r27         <dbl+lbl> 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, …
$ colorr      <dbl+lbl> 5, 3, 8, 4, 9, 5, 9, 3, 9, 7, 3, 6, 3, 5, 4, 4, 5, 3, …
$ surlangq    <dbl+lbl> 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, …
$ noise1      <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
$ conocim     <dbl+lbl> 5, 2, 3, 4, 3, 4, 4, 1, 3, 2, 1, 2, 3, 3, 1, 3, 2, 1, …
$ sexin       <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, …
$ colori      <dbl+lbl> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, …
$ formatq     <dbl+lbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
$ idiomaq     <dbl+lbl> 2700, 2700, 2700, 2700, 2700, 2700, 2700, 2700, 2700, …
$ fecha       <chr> "20apr2023", "28mar2023", "01apr2023", "12apr2023", "30mar…

Exploring data sets

glimpse(stata_data)
Rows: 2,502
Columns: 114
$ hhid          <dbl> 1102500401, 1102500501, 1102500502, 1102501202, 11074305…
$ domain2       <dbl+lbl> 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 2.0, 2.…
$ psu           <dbl> 10250, 10250, 10250, 10250, 10743, 10743, 10743, 10743, …
$ domain        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ gp_subdom     <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ district      <chr> "Paramaribo", "Paramaribo", "Paramaribo", "Paramaribo", …
$ fortnight     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ panel         <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ hhid16        <dbl> 6015041, 6015051, 6015052, 6015122, 6039051, 6039081, 60…
$ lat_cen       <dbl> 5.847621, 5.847621, 5.847621, 5.847621, 5.819147, 5.8191…
$ long_cen      <dbl> -55.17032, -55.17032, -55.17032, -55.17032, -55.21745, -…
$ result        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ end_date_n    <date> 2022-01-04, 2022-01-05, 2022-01-10, 2022-01-04, 2022-01…
$ Year_s        <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 20…
$ Month_s       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ Day           <dbl> 4, 5, 10, 4, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 15, 9, 14,…
$ stratum       <dbl> 2, 2, 2, 2, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6,…
$ hhid_text     <chr> "01102500401", "01102500501", "01102500502", "0110250120…
$ HHsize        <dbl> 1, 3, 1, 4, 1, 2, 5, 1, 3, 1, 7, 4, 1, 1, 4, 2, 2, 2, 2,…
$ HHsize2       <dbl> 1, 2, 1, 4, 1, 2, 5, 1, 3, 1, 7, 4, 1, 1, 4, 2, 2, 2, 2,…
$ interv        <dbl> 75, 75, 75, 75, 92, 92, 92, 92, 92, 83, 77, 74, 99, 74, …
$ end_date      <chr> "04/01/22", "05/01/22", "10/01/22", "04/01/22", "05/01/2…
$ q17_02        <dbl+lbl> 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2…
$ q17_03a       <dbl+lbl>  1,  1,  1, NA,  1,  1,  1,  1,  1,  1,  1,  1,  1, …
$ q17_03b       <dbl+lbl>  2,  2,  2, NA,  2,  2,  2,  2,  2,  2,  2,  2,  2, …
$ q17_04        <dbl+lbl> NA, NA, NA,  1, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12a          <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q12_01a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_01b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_02a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_02b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_03a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_03b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_04a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_04b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_05        <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q13_01        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q13_01_ot     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ q13_02        <dbl+lbl> 2, 1, 1, 5, 2, 2, 3, 2, 2, 5, 8, 1, 3, 2, 2, 3, 8, 2…
$ q13_03        <dbl+lbl>  5,  5,  1, NA,  2,  2,  5,  5,  2, NA, NA,  5,  3, …
$ q13_04        <dbl> 1985, 2006, 2003, -1, 1982, 1980, 1982, 1992, 1980, 2005…
$ q13_05        <dbl+lbl> 1, 2, 2, 2, 1, 1, 1, 2, 1, 1, 1, 2, 2, 2, 2, 2, 1, 2…
$ q13_06        <chr> "DAKPLATEN VERWISSELEN EN KEUKEN BIJGEBOU", "", "", "", …
$ q13_07        <dbl> 2020, NA, NA, NA, 2020, 2017, 2019, NA, 2017, 2021, 2020…
$ q13_08        <dbl+lbl> 2, 5, 2, 5, 1, 1, 5, 2, 1, 5, 1, 5, 5, 5, 2, 5, 2, 5…
$ q13_09        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 2, 1, 1, 1, 1, 1, 1, 1…
$ q13_10        <dbl+lbl> 2, 5, 5, 1, 2, 1, 1, 1, 2, 3, 1, 5, 1, 5, 4, 1, 5, 5…
$ q13_11        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q13_12a       <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2…
$ q13_12b       <dbl+lbl> 2, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q13_12c       <dbl+lbl> 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2…
$ q13_12d       <dbl+lbl> 2, 1, 1, 1, 2, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q13_12e       <dbl+lbl> 1, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 2…
$ q13_12f       <dbl+lbl> 2, 1, 1, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q13_12g       <dbl+lbl> 2, 1, 2, 2, 2, 2, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2…
$ q13_13        <dbl+lbl> 4, 4, 7, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4…
$ q13_14        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q13_14_ot     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ q13_15        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 4, 4, 5…
$ q13_16        <dbl+lbl>  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1, …
$ q13_17        <dbl+lbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q13_18        <dbl> 2, 4, 2, 2, 4, 3, 4, 5, 3, 4, 4, 4, 4, 3, 5, 2, 4, 4, 2,…
$ q13_19        <dbl> 2, 4, 2, 2, 2, 3, 4, 2, 3, 2, 3, 4, 3, 2, 5, 2, 3, 3, 2,…
$ q13_20        <dbl> 1, 2, 2, 1, 1, 2, 3, 1, 2, 3, 2, 2, 2, 1, 2, 2, 3, 2, 2,…
$ q13_21        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_22        <dbl> 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0,…
$ q13_23a       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1,…
$ q13_23b       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ q13_23c       <dbl> 1, 0, 1, 1, 2, 2, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 4, 1, 2,…
$ q13_23d       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23e       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23f       <dbl> 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0,…
$ q13_23h       <dbl> 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23i       <dbl> 0, 1, 1, 0, 1, 2, 2, 1, 2, 1, 2, 1, 0, 0, 2, 0, 2, 0, 1,…
$ q13_23j       <dbl> 1, 2, 3, 1, 1, 2, 5, 1, 2, 1, 5, 4, 1, 0, 4, 0, 2, 0, 1,…
$ q13_23k       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 1, 1,…
$ q13_23l       <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 1, 0, 1, 0,…
$ q13_23m       <dbl> 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0,…
$ q13_23n       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_24        <dbl+lbl> 2, 2, 2, 2, 1, 1, 1, 2, 1, 2, 2, 3, 5, 3, 2, 1, 2, 2…
$ q19_01a       <dbl+lbl> 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1…
$ q19_01b       <dbl+lbl>  1,  2,  2,  1,  1,  2,  2,  2,  2,  2,  2,  2,  2, …
$ q19_01c       <dbl+lbl>  1,  2,  2,  1,  1,  2,  2,  2,  2,  2,  1,  2,  2, …
$ q19_01d       <dbl+lbl> 1, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1…
$ q19_01e       <dbl+lbl> 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1…
$ q19_01f       <dbl+lbl>  1,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  2,  1, …
$ q19_01g       <dbl+lbl> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1…
$ q19_01h       <dbl+lbl> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1…
$ q20_01_2001a  <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2, 1, 2, 2, 2, 2, 2…
$ q20_01_2001b  <dbl+lbl> 1, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2…
$ q20_01_2001c  <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q20_01_2001d  <dbl+lbl> 1, 1, 2, 1, 1, 2, 2, 2, 2, 2, 2, 1, 1, 2, 2, 1, 2, 1…
$ q20_01_2001e  <dbl+lbl> 1, 2, 2, 1, 2, 2, 2, 2, 2, 1, 2, 1, 1, 2, 2, 2, 1, 2…
$ q20_01_2001f  <dbl+lbl> 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q20_01_2001g  <dbl+lbl> 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q20_01_2001h  <dbl+lbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ q20_01_2001i  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ weight2       <dbl> 36.20733, 50.22768, 36.41680, 32.17578, 56.93408, 64.684…
$ weight3       <dbl> 36.20733, 150.68304, 36.41680, 128.70313, 56.93408, 129.…
$ quintile      <dbl+lbl> 5, 3, 5, 2, 5, 5, 4, 5, 4, 5, 4, 5, 4, 2, 4, 5, 5, 3…
$ decile        <dbl+lbl> 10,  6, 10,  4, 10, 10,  8, 10,  7, 10,  7,  9,  8, …
$ centile       <dbl> 100, 60, 98, 33, 98, 99, 75, 99, 66, 99, 65, 89, 78, 29,…
$ cons_pc       <dbl> 22965.266, 4823.205, 14460.016, 3115.352, 15724.735, 180…
$ food_pc       <dbl> 14684.6487, 1991.8493, 5201.2645, 1488.5779, 5734.7305, …
$ nonfood_pc    <dbl> 8280.6176, 2831.3561, 9258.7511, 1626.7739, 9990.0040, 1…
$ line_extreme  <dbl> 1011.0048, 1011.0048, 1011.0048, 1011.0048, 1011.0048, 1…
$ line_moderate <dbl> 2659.820, 2659.820, 2659.820, 2659.820, 2659.820, 2659.8…
$ poor_extreme  <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ poor_all      <dbl+lbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ poor          <dbl+lbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ Year          <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 20…
$ CPI_june2017  <dbl> 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, …
$ CPI_june2022  <dbl> 460.825, 460.825, 460.825, 460.825, 460.825, 460.825, 46…
$ CPI_2017_22   <dbl> 3.622838, 3.622838, 3.622838, 3.622838, 3.622838, 3.6228…

Labels and factors

Currently, the q1tc_r variable is a numeric vector - even though the SPSS labels are still recorded. In R, we often use factors to represent categorical data. Factors are stored as integers with labels attached.

is.factor(spss_data$q1tc_r)
[1] FALSE
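
As a quick base R illustration of "integers with labels attached" (the vector `gender` here is a made-up example, not the survey variable):

```r
# Hypothetical toy factor with two levels
gender <- factor(
  c("Woman/female", "Man/male", "Woman/female"),
  levels = c("Man/male", "Woman/female")
)

typeof(gender)      # storage type of the factor
as.integer(gender)  # the underlying integer codes
levels(gender)      # the labels that the codes point to
```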

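We can convert the q1tc_r variable to a factor with the haven package’s as_factor() function, which turns the value labels into factor levels. Applied to the whole data frame, as here, it converts every labelled column, including numeric ones such as q2.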

spss_data <- as_factor(spss_data)
is.factor(spss_data$q1tc_r)
[1] TRUE
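
A small sketch of what `as_factor()` does to a single labelled vector. The objects `gender_lbl` and `trust_lbl` are hypothetical and only mimic the structure of `q1tc_r` and the `b`-items:

```r
library(haven)

# Fully labelled variable: every value has a label
gender_lbl <- labelled(
  c(2, 1, 2, NA),
  labels = c("Man/male" = 1, "Woman/female" = 2)
)
gender_fct <- as_factor(gender_lbl)
levels(gender_fct)

# Partially labelled scale: only the end points have a label
trust_lbl <- labelled(
  c(1, 4, 7),
  labels = c("Not at all" = 1, "A lot" = 7)
)
trust_chr <- as.character(as_factor(trust_lbl))
trust_chr
```

With the default settings, values without a label keep their number as the level name. That is why partially labelled scales end up as a mix of labels and numbers after conversion, so it may be worth checking which columns you actually want as factors.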

Exploring again - factored

glimpse(spss_data) # already converted with as_factor() above
Rows: 1,539
Columns: 162
$ idnum       <fct> 5581, 5642, 4622, 4034, 9206, 2101, 3574, 709, 8666, 1566,…
$ pais        <fct> Suriname, Suriname, Suriname, Suriname, Suriname, Suriname…
$ nationality <fct> Surinamese, Surinamese, Surinamese, Surinamese, Surinamese…
$ estratopri  <fct> Wanica / Para, Paramaribo, Paramaribo, Wanica / Para, Para…
$ estratosec  <fct> "Medium (Between 3,000 and 10,000 inhabitants)", "Large (M…
$ strata      <fct> 2702, 2701, 2701, 2702, 2701, 2702, 2701, 2701, 2701, 2701…
$ prov        <fct> Wanica, Paramaribo, Paramaribo, Wanica, Paramaribo, Wanica…
$ municipio   <fct> Saramacca Polder, Flora, Flora, Saramacca Polder, Flora, S…
$ upm         <fct> 43, 21, 21, 43, 21, 43, 21, 21, 21, 21, 21, 21, 43, 21, 21…
$ ur          <fct> Rural, Urban, Urban, Rural, Urban, Rural, Urban, Urban, Ur…
$ cluster     <fct> 87, 67, 5, 234, 67, 234, 233, 233, 5, 92, 67, 233, 234, 5,…
$ year        <fct> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
$ wave        <fct> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
$ wt          <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q1tc_r      <fct> NA, Woman/female, Man/male, Woman/female, Man/male, Woman/…
$ q2          <fct> 59, 61, 30, 36, 34, 53, 75, 27, 27, 50, 60, 34, 50, 30, 18…
$ a4n         <fct> "Economic issues", "Economic issues", "Economic issues", "…
$ soct2       <fct> Worse, Worse, Worse, Worse, Worse, Same, Worse, Worse, Wor…
$ idio2       <fct> Worse, Worse, Worse, Worse, Worse, Worse, Worse, Worse, Wo…
$ mesfut1     <fct> Not at all, Somewhat, A little, A little, Somewhat, Not at…
$ cp8         <fct> Never, Never, Never, Never, Never, Never, Never, Never, Ne…
$ it1         <fct> Somewhat trustworthy, Very trustworthy, Not very trustwort…
$ jc10        <fct> A military take-over of the state would be justified, A mi…
$ jc13        <fct> NA, NA, NA, A military take-over of the state would be jus…
$ jc15a       <fct> "No, it is not justified", NA, NA, "No, it is not justifie…
$ jc16a       <fct> NA, "No, it is not justified", "No, it is not justified", …
$ vic1ext     <fct> No, No, No, No, No, No, No, Yes, No, No, No, No, No, No, N…
$ aoj11       <fct> Somewhat safe, Somewhat safe, Very unsafe, Somewhat unsafe…
$ aoj12       <fct> NA, A lot, A lot, A lot, Some, None, Little, None, Some, S…
$ pese1       <fct> Lower, Lower, Higher, Lower, Lower, Higher, Lower, Higher,…
$ pese2       <fct> About the same, Higher, Higher, Lower, Lower, Higher, Abou…
$ aoj17       <fct> None, None, A lot, None, Somewhat, None, None, Little, Lit…
$ ivol24      <fct> No, NA, No, No, No, No, No, Yes, NA, NA, No, Yes, No, No, …
$ gang21      <fct> NA, NA, One person, NA, No, NA, NA, NA, One person, NA, NA…
$ gang22      <fct> NA, NA, Four or more persons, NA, No, NA, NA, One person, …
$ gang23      <fct> NA, NA, No, NA, Yes, NA, NA, No, NA, NA, NA, NA, No, NA, N…
$ gang24      <fct> NA, NA, Yes, NA, No, NA, NA, NA, Yes, NA, NA, NA, No, NA, …
$ aoj30       <fct> No, Yes, Yes, Yes, Yes, No, No, Yes, Yes, Yes, Yes, Yes, Y…
$ diso8       <fct> Not a problem, NA, Very serious, Not a problem, Not a prob…
$ drugt1      <fct> Very serious, Very serious, Very serious, Very serious, Ve…
$ countfair1  <fct> NA, Always, Always, Never, Sometimes, Always, Sometimes, N…
$ countfair3  <fct> Never, Never, Always, Sometimes, Sometimes, Always, Someti…
$ chm1bn      <fct> NA, NA, "To be able to vote to elect the authorities, even…
$ chm2bn      <fct> NA, NA, NA, "A system that guarantees access to a basic in…
$ b0          <fct> Not at all, 4, A lot, Not at all, 4, 4, 6, A lot, 3, A lot…
$ b1          <fct> A lot, 5, 4, 4, 5, A lot, 4, 3, 4, 5, 4, 4, 5, 6, 5, 4, 3,…
$ b2          <fct> 6, 4, 2, 4, 4, 2, 4, 5, 4, 6, 3, 3, 4, Not at all, 6, 5, N…
$ b3          <fct> 2, 3, Not at all, 4, 5, 2, 2, 2, 5, 3, 4, 2, 3, Not at all…
$ b4          <fct> 5, 5, Not at all, 4, 5, A lot, 3, Not at all, 5, 2, Not at…
$ b6          <fct> 4, 3, A lot, 4, A lot, A lot, A lot, 2, 5, 4, Not at all, …
$ b12         <fct> 3, 6, Not at all, 5, 4, Not at all, 5, Not at all, 5, Not …
$ b13         <fct> 2, 6, Not at all, Not at all, 5, Not at all, 4, 3, 3, 3, 4…
$ b18         <fct> Not at all, 5, Not at all, 5, 2, Not at all, 3, 2, 3, Not …
$ b21         <fct> A lot, 3, Not at all, Not at all, 4, Not at all, 4, Not at…
$ b21a        <fct> Not at all, 5, Not at all, Not at all, 4, Not at all, Not …
$ b21f        <fct> 2, 5, Not at all, Not at all, 4, Not at all, 5, 2, 2, 2, N…
$ b31         <fct> 3, 6, Not at all, 5, 3, Not at all, 4, 2, 2, Not at all, 5…
$ b37         <fct> 5, 6, Not at all, 5, 5, Not at all, 5, 3, 3, 5, 4, 4, 4, 4…
$ b47a        <fct> Not at all, 5, Not at all, 5, 3, Not at all, 5, 3, 5, 4, 6…
$ m1          <fct> Bad, Neither good nor bad (fair), Very bad, Very bad, Neit…
$ pop101      <fct> 2, 4, Strongly disagree, 4, 5, Strongly disagree, 2, Stron…
$ pop107      <fct> Strongly disagree, 3, Strongly agree, Strongly agree, 6, S…
$ ros4        <fct> Strongly agree, Strongly agree, 5, Strongly agree, 6, Stro…
$ ing4        <fct> Strongly disagree, Strongly agree, 2, 4, 4, Strongly disag…
$ eff1        <fct> 3, 5, Strongly disagree, Strongly disagree, 3, Strongly di…
$ eff2        <fct> 2, 5, 6, 4, 5, Strongly disagree, 4, Strongly disagree, 5,…
$ dra1        <fct> 5, 6, Strongly agree, 4, 5, Strongly disagree, Strongly di…
$ dra2        <fct> Strongly agree, 6, Strongly agree, 4, 5, Strongly disagree…
$ vb21n       <fct> "It is not possible to have an influence to change things"…
$ crg1        <fct> "No, it is not justifiable", "No, it is not justifiable", …
$ crg2        <fct> "No, it is not justifiable", "No, it is not justifiable", …
$ env2b       <fct> Very serious, Very serious, Very serious, Very serious, So…
$ anestg      <fct> Somewhat, A lot, Not at all, Not at all, Somewhat, Not at …
$ pn4         <fct> Dissatisfied, Satisfied, Dissatisfied, Dissatisfied, Dissa…
$ dem30       <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, No, Yes, Yes, Yes, …
$ e5          <fct> Strongly disapprove, 8, Strongly approve, Strongly disappr…
$ e20sur      <fct> 2, Strongly disapprove, 8, Strongly approve, 7, Strongly d…
$ e17a        <fct> 3, NA, Strongly approve, Strongly approve, 9, NA, 9, Stron…
$ e17b        <fct> NA, 8, NA, NA, NA, Strongly disapprove, NA, NA, NA, 6, Str…
$ d3          <fct> 5, 4, 6, Strongly disapprove, 6, Strongly disapprove, 3, S…
$ d4          <fct> 7, 7, Strongly disapprove, 4, 5, Strongly disapprove, 2, S…
$ d6          <fct> Strongly disapprove, 5, Strongly disapprove, Strongly disa…
$ d5newa      <fct> NA, 6, Strongly disapprove, NA, NA, Strongly disapprove, N…
$ d5newb      <fct> 7, NA, NA, 5, 5, NA, Strongly disapprove, NA, NA, NA, NA, …
$ exc2        <fct> No, No, No, No, No, No, No, No, Yes, No, No, No, No, No, N…
$ exc6        <fct> No, No, No, No, No, No, No, No, Yes, No, No, No, No, No, N…
$ exc7new     <fct> NA, Less than half of them, All, Less than half of them, M…
$ aoj2cora    <fct> No, Yes, No, No, No, No, Yes, No, No, Yes, No, Yes, No, NA…
$ aoj2corb    <fct> NA, Not so serious, NA, NA, NA, NA, Very serious, NA, NA, …
$ lib2c       <fct> Enough, Very little, Very little, Very little, Very little…
$ vb2         <fct> Voted, Did not vote, Did not vote, Voted, Voted, Voted, Di…
$ vb3n        <fct> NDP - National Democratic Party, NA, NA, VHP - Progressive…
$ vb10        <fct> No, No, No, No, Yes, No, No, No, No, No, Yes, No, No, No, …
$ vb11        <fct> NA, NA, NA, NA, National Democratic Party - NDP, NA, NA, N…
$ pol1        <fct> None, Some, None, A lot, Little, None, Some, None, Some, S…
$ vb20        <fct> Would vote for a candidate or party different from the cur…
$ vb50        <fct> Strongly disagree, Disagree, Strongly disagree, Disagree, …
$ vb51        <fct> Both the same, NA, NA, Both the same, Both the same, Both …
$ vb52        <fct> NA, A woman, It does not matter, NA, NA, NA, NA, It does n…
$ vb58        <fct> NA, NA, NA, Disagree, Disagree, Disagree, Agree, NA, Disag…
$ vb58exp     <fct> Disagree, Disagree, Disagree, NA, NA, NA, NA, Strongly agr…
$ dvw1        <fct> Would not approve but understand, Would not approve nor un…
$ dvw2        <fct> Would not approve nor understand, Would not approve nor un…
$ for5n       <fct> India, United States, United States, United States, United…
$ mil10a      <fct> Somewhat trustworthy, Not very trustworthy, Somewhat trust…
$ mil10b      <fct> Not at all trustworthy, Not very trustworthy, NA, NA, Not …
$ mil10e      <fct> Somewhat trustworthy, Somewhat trustworthy, Somewhat trust…
$ dis11       <fct> Never, Never, Many times, Never, Never, Never, Never, Some…
$ dis12       <fct> Never, Never, Many times, Never, Never, Never, Never, Some…
$ mgm1        <fct> Very harmful, Very harmful, Very harmful, Very harmful, So…
$ mgm2        <fct> There Is no gold mining where you live, There Is no gold m…
$ mgm3        <fct> NA, Too much, Very little, Too much, Very little, NA, Very…
$ wf1         <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No, No…
$ edre        <fct> "Primary School incomplete", "Secondary (Middle through Hi…
$ q3cn        <fct> "Hindu", NA, "None (Believes in a Supreme Entity but does …
$ q5b         <fct> Very important, Very important, Very important, Very impor…
$ ocup4a      <fct> "Not working and not looking for a job?", "Taking care of …
$ formal      <fct> NA, NA, No, NA, Yes, Yes, NA, Yes, Yes, Yes, No, Yes, Yes,…
$ q10inc      <fct> "Between SRD 1,501 - SRD 2,500", "Between SRD 5,201 - SRD …
$ q10e        <fct> Decreased?, Remained the same?, Remained the same?, Remain…
$ q14         <fct> No, No, Yes, Yes, Yes, No, No, No, No, No, Yes, Yes, Yes, …
$ fs2         <fct> No, No, No, Yes, No, No, No, Yes, Yes, Yes, No, No, Yes, N…
$ fs212       <fct> No, No, No, NA, Yes, No, No, NA, NA, NA, No, No, NA, No, N…
$ fs8         <fct> No, No, Yes, No, No, Yes, No, No, Yes, No, No, No, Yes, No…
$ ws1         <fct> No, No, No, No, No, No, No, No, No, Yes, No, No, No, No, N…
$ ws2         <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, Always, NA, NA, NA, NA…
$ q11n        <fct> Married, Married, Single, Common law marriage (Living toge…
$ q12cn       <fct> 3, 2, 9, 6, 5, 6, 2, 2, 4, 2, 4, 6, 4, 4, 8, 6, 4, 1, 5, 7…
$ q12bn       <fct> None, None, 2, 2, 1, None, None, None, None, None, None, 2…
$ q12bnf      <fct> NA, NA, "No", "Yes, parent", "Yes, parent", NA, NA, NA, NA…
$ q12p        <fct> NA, 22, NA, 20, NA, 23, 29, NA, NA, 17, NA, 19, NA, 26, NA…
$ etid        <fct> Hindustani ("East Indians"), Mixed, Afro-Surinamese/Creole…
$ leng1       <fct> Sarnami, Dutch, Dutch, Sarnami, Dutch, Sarnami, Dutch, Dut…
$ gi0n        <fct> A few times a month, Daily, Daily, Daily, A few times a we…
$ smedia1n    <fct> No, Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Y…
$ smedia3n    <fct> NA, A few times a week, Daily, NA, Daily, NA, Daily, Daily…
$ smedia3b    <fct> NA, NA, Several times a day, NA, Several times a day, NA, …
$ smedia11    <fct> No, No, No, Yes, Yes, No, Yes, Yes, Yes, No, Yes, Yes, No,…
$ smedia12    <fct> No, No, No, Yes, No, No, No, Yes, No, No, No, No, No, No, …
$ smedia13    <fct> Not at all, Very much, Somewhat, Not at all, Somewhat, Not…
$ smedia14n   <fct> Not at all, Somewhat, Very much, Somewhat, Somewhat, Not a…
$ smedia15    <fct> Not at all, A Little, Very much, Not at all, A Little, Not…
$ smedia16    <fct> Phone call, Text message, In person, Phone call, Text mess…
$ r3          <fct> No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r4a         <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r6          <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes…
$ r7          <fct> No, Yes, No, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, N…
$ r12         <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes…
$ r15         <fct> No, Yes, Yes, No, Yes, No, No, Yes, No, No, Yes, Yes, Yes,…
$ r18n        <fct> No, Yes, Yes, No, Yes, No, No, Yes, Yes, No, Yes, Yes, Yes…
$ r18         <fct> No, Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Y…
$ r16         <fct> Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r27         <fct> No, Yes, Yes, No, Yes, No, No, No, No, No, Yes, No, Yes, N…
$ colorr      <fct> 5, 3, 8, 4, 9, 5, 9, 3, 9, 7, 3, 6, 3, 5, 4, 4, 5, 3, 8, 5…
$ surlangq    <fct> The full interview was conducted in Sranan Tongo, The full…
$ noise1      <fct> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No"…
$ conocim     <fct> Very low, High, Neither high or low, Low, Neither high or …
$ sexin       <fct> Female/Woman, Female/Woman, Female/Woman, Female/Woman, Fe…
$ colori      <fct> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9…
$ formatq     <fct> STG, STG, STG, STG, STG, STG, STG, STG, STG, STG, STG, STG…
$ idiomaq     <fct> See SURLANGQ, See SURLANGQ, See SURLANGQ, See SURLANGQ, Se…
$ fecha       <chr> "20apr2023", "28mar2023", "01apr2023", "12apr2023", "30mar…

Exploring again - factored

glimpse(as_factor(stata_data))
Rows: 2,502
Columns: 114
$ hhid          <dbl> 1102500401, 1102500501, 1102500502, 1102501202, 11074305…
$ domain2       <fct> 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, 1.1, Rest of the…
$ psu           <dbl> 10250, 10250, 10250, 10250, 10743, 10743, 10743, 10743, …
$ domain        <fct> Great Paramaribo, Great Paramaribo, Great Paramaribo, Gr…
$ gp_subdom     <fct> Paramaribo, Paramaribo, Paramaribo, Paramaribo, Paramari…
$ district      <chr> "Paramaribo", "Paramaribo", "Paramaribo", "Paramaribo", …
$ fortnight     <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ panel         <fct> Panel, Panel, Panel, Panel, Panel, Panel, Panel, Panel, …
$ hhid16        <dbl> 6015041, 6015051, 6015052, 6015122, 6039051, 6039081, 60…
$ lat_cen       <dbl> 5.847621, 5.847621, 5.847621, 5.847621, 5.819147, 5.8191…
$ long_cen      <dbl> -55.17032, -55.17032, -55.17032, -55.17032, -55.21745, -…
$ result        <fct> Interview finalized - Fully completed, Interview finaliz…
$ end_date_n    <date> 2022-01-04, 2022-01-05, 2022-01-10, 2022-01-04, 2022-01…
$ Year_s        <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 20…
$ Month_s       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ Day           <dbl> 4, 5, 10, 4, 5, 5, 5, 5, 5, 7, 7, 7, 7, 7, 7, 15, 9, 14,…
$ stratum       <dbl> 2, 2, 2, 2, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 6, 6, 6, 6,…
$ hhid_text     <chr> "01102500401", "01102500501", "01102500502", "0110250120…
$ HHsize        <dbl> 1, 3, 1, 4, 1, 2, 5, 1, 3, 1, 7, 4, 1, 1, 4, 2, 2, 2, 2,…
$ HHsize2       <dbl> 1, 2, 1, 4, 1, 2, 5, 1, 3, 1, 7, 4, 1, 1, 4, 2, 2, 2, 2,…
$ interv        <dbl> 75, 75, 75, 75, 92, 92, 92, 92, 92, 83, 77, 74, 99, 74, …
$ end_date      <chr> "04/01/22", "05/01/22", "10/01/22", "04/01/22", "05/01/2…
$ q17_02        <fct> Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Ye…
$ q17_03a       <fct> YES, YES, YES, NA, YES, YES, YES, YES, YES, YES, YES, YE…
$ q17_03b       <fct> NO, NO, NO, NA, NO, NO, NO, NO, NO, NO, NO, NO, NO, NA, …
$ q17_04        <fct> NA, NA, NA, "Device cost (cell phone, computer, tablet)"…
$ q12a          <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No, …
$ q12_01a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_01b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_02a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_02b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_03a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_03b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_04a       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_04b       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q12_05        <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q13_01        <fct> House, House, House, House, House, House, House, House, …
$ q13_01_ot     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ q13_02        <fct> Owned (without mortgage), Owned (with mortgage), Owned (…
$ q13_03        <fct> Gov't rented/ Leased, Gov't rented/ Leased, Owned (with …
$ q13_04        <dbl> 1985, 2006, 2003, -1, 1982, 1980, 1982, 1992, 1980, 2005…
$ q13_05        <fct> Yes, No, No, No, Yes, Yes, Yes, No, Yes, Yes, Yes, No, N…
$ q13_06        <chr> "DAKPLATEN VERWISSELEN EN KEUKEN BIJGEBOU", "", "", "", …
$ q13_07        <dbl> 2020, NA, NA, NA, 2020, 2017, 2019, NA, 2017, 2021, 2020…
$ q13_08        <fct> Wood and building stones/bricks, Building stones, Wood a…
$ q13_09        <fct> Fully enclosed masonry, Fully enclosed masonry, Fully en…
$ q13_10        <fct> Wood, Tiles, Tiles, Masonry, Wood, Masonry, Masonry, Mas…
$ q13_11        <fct> "Metal roofing sheets/roof tiles (zinc, galva, aluminum)…
$ q13_12a       <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No, …
$ q13_12b       <fct> No, No, No, No, Yes, Yes, Yes, No, Yes, No, No, No, No, …
$ q13_12c       <fct> Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, No, No,…
$ q13_12d       <fct> No, Yes, Yes, Yes, No, Yes, No, Yes, Yes, No, No, No, No…
$ q13_12e       <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, No, No, No, No, No, No…
$ q13_12f       <fct> No, Yes, Yes, No, Yes, Yes, No, Yes, No, No, No, No, No,…
$ q13_12g       <fct> No, Yes, No, No, No, No, Yes, Yes, No, Yes, No, No, No, …
$ q13_13        <fct> Gas (propane), Gas (propane), None (does not cook), Gas …
$ q13_14        <fct> WC with flushing (linked to septic tank), WC with flushi…
$ q13_14_ot     <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ q13_15        <fct> "Piped water in dwelling", "Piped water in dwelling", "P…
$ q13_16        <fct> SWM, SWM, SWM, SWM, SWM, SWM, SWM, SWM, SWM, SWM, SWM, S…
$ q13_17        <fct> Electricity directly from EBS, Electricity directly from…
$ q13_18        <dbl> 2, 4, 2, 2, 4, 3, 4, 5, 3, 4, 4, 4, 4, 3, 5, 2, 4, 4, 2,…
$ q13_19        <dbl> 2, 4, 2, 2, 2, 3, 4, 2, 3, 2, 3, 4, 3, 2, 5, 2, 3, 3, 2,…
$ q13_20        <dbl> 1, 2, 2, 1, 1, 2, 3, 1, 2, 3, 2, 2, 2, 1, 2, 2, 3, 2, 2,…
$ q13_21        <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_22        <dbl> 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0,…
$ q13_23a       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1,…
$ q13_23b       <dbl> 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,…
$ q13_23c       <dbl> 1, 0, 1, 1, 2, 2, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 4, 1, 2,…
$ q13_23d       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23e       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23f       <dbl> 1, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0,…
$ q13_23h       <dbl> 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_23i       <dbl> 0, 1, 1, 0, 1, 2, 2, 1, 2, 1, 2, 1, 0, 0, 2, 0, 2, 0, 1,…
$ q13_23j       <dbl> 1, 2, 3, 1, 1, 2, 5, 1, 2, 1, 5, 4, 1, 0, 4, 0, 2, 0, 1,…
$ q13_23k       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 2, 1, 1, 1,…
$ q13_23l       <dbl> 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 1, 0, 1, 0,…
$ q13_23m       <dbl> 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0,…
$ q13_23n       <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
$ q13_24        <fct> Relatives, Relatives, Relatives, Relatives, This dweling…
$ q19_01a       <fct> Yes, Yes, No, Yes, Yes, No, No, No, No, No, Yes, No, No,…
$ q19_01b       <fct> Yes, No, No, Yes, Yes, No, No, No, No, No, No, No, No, N…
$ q19_01c       <fct> Yes, No, No, Yes, Yes, No, No, No, No, No, Yes, No, No, …
$ q19_01d       <fct> Yes, No, No, No, No, No, No, No, No, Yes, No, No, No, No…
$ q19_01e       <fct> Yes, Yes, No, Yes, No, No, No, No, No, No, Yes, No, No, …
$ q19_01f       <fct> Yes, No, No, No, No, No, No, No, No, No, No, No, Yes, No…
$ q19_01g       <fct> Yes, No, No, No, No, No, No, No, No, No, No, No, No, No,…
$ q19_01h       <fct> Yes, No, No, No, No, No, No, No, No, No, No, No, No, No,…
$ q20_01_2001a  <fct> No, No, No, No, No, No, No, No, No, Yes, No, No, Yes, No…
$ q20_01_2001b  <fct> Yes, No, No, Yes, No, No, No, No, No, Yes, No, No, No, N…
$ q20_01_2001c  <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ q20_01_2001d  <fct> Yes, Yes, No, Yes, Yes, No, No, No, No, No, No, Yes, Yes…
$ q20_01_2001e  <fct> Yes, No, No, Yes, No, No, No, No, No, Yes, No, Yes, Yes,…
$ q20_01_2001f  <fct> Yes, No, No, No, No, No, No, No, No, No, No, No, No, No,…
$ q20_01_2001g  <fct> No, Yes, No, No, No, No, Yes, No, No, No, No, No, No, No…
$ q20_01_2001h  <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No, …
$ q20_01_2001i  <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", …
$ weight2       <dbl> 36.20733, 50.22768, 36.41680, 32.17578, 56.93408, 64.684…
$ weight3       <dbl> 36.20733, 150.68304, 36.41680, 128.70313, 56.93408, 129.…
$ quintile      <fct> Q5, Q3, Q5, Q2, Q5, Q5, Q4, Q5, Q4, Q5, Q4, Q5, Q4, Q2, …
$ decile        <fct> D10, D, D10, D4, D10, D10, D8, D10, D7, D10, D7, D9, D8,…
$ centile       <dbl> 100, 60, 98, 33, 98, 99, 75, 99, 66, 99, 65, 89, 78, 29,…
$ cons_pc       <dbl> 22965.266, 4823.205, 14460.016, 3115.352, 15724.735, 180…
$ food_pc       <dbl> 14684.6487, 1991.8493, 5201.2645, 1488.5779, 5734.7305, …
$ nonfood_pc    <dbl> 8280.6176, 2831.3561, 9258.7511, 1626.7739, 9990.0040, 1…
$ line_extreme  <dbl> 1011.0048, 1011.0048, 1011.0048, 1011.0048, 1011.0048, 1…
$ line_moderate <dbl> 2659.820, 2659.820, 2659.820, 2659.820, 2659.820, 2659.8…
$ poor_extreme  <fct> Not extreme poor, Not extreme poor, Not extreme poor, No…
$ poor_all      <fct> Not poor, Not poor, Not poor, Not poor, Not poor, Not po…
$ poor          <fct> Non poor, Non poor, Non poor, Non poor, Non poor, Non po…
$ Year          <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 20…
$ CPI_june2017  <dbl> 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, 127.2, …
$ CPI_june2022  <dbl> 460.825, 460.825, 460.825, 460.825, 460.825, 460.825, 46…
$ CPI_2017_22   <dbl> 3.622838, 3.622838, 3.622838, 3.622838, 3.622838, 3.6228…

Pipes

Pipes chain multiple operations together so that code reads from left to right instead of inside out. The pipe operator %>% takes the output of the left-hand side and passes it as the first argument to the function on the right-hand side. Since version 4.1.0, R also has a native (base R) pipe operator, |>.

Remember

glimpse(as_factor(spss_data))

With a pipe this would be

spss_data %>% 
  as_factor() %>% 
  glimpse()

and with the base R pipe |> this would be

spss_data |>
  as_factor() |>
  glimpse()
Rows: 1,539
Columns: 162
$ idnum       <fct> 5581, 5642, 4622, 4034, 9206, 2101, 3574, 709, 8666, 1566,…
$ pais        <fct> Suriname, Suriname, Suriname, Suriname, Suriname, Suriname…
$ nationality <fct> Surinamese, Surinamese, Surinamese, Surinamese, Surinamese…
$ estratopri  <fct> Wanica / Para, Paramaribo, Paramaribo, Wanica / Para, Para…
$ estratosec  <fct> "Medium (Between 3,000 and 10,000 inhabitants)", "Large (M…
$ strata      <fct> 2702, 2701, 2701, 2702, 2701, 2702, 2701, 2701, 2701, 2701…
$ prov        <fct> Wanica, Paramaribo, Paramaribo, Wanica, Paramaribo, Wanica…
$ municipio   <fct> Saramacca Polder, Flora, Flora, Saramacca Polder, Flora, S…
$ upm         <fct> 43, 21, 21, 43, 21, 43, 21, 21, 21, 21, 21, 21, 43, 21, 21…
$ ur          <fct> Rural, Urban, Urban, Rural, Urban, Rural, Urban, Urban, Ur…
$ cluster     <fct> 87, 67, 5, 234, 67, 234, 233, 233, 5, 92, 67, 233, 234, 5,…
$ year        <fct> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
$ wave        <fct> 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023, 2023…
$ wt          <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ q1tc_r      <fct> NA, Woman/female, Man/male, Woman/female, Man/male, Woman/…
$ q2          <fct> 59, 61, 30, 36, 34, 53, 75, 27, 27, 50, 60, 34, 50, 30, 18…
$ a4n         <fct> "Economic issues", "Economic issues", "Economic issues", "…
$ soct2       <fct> Worse, Worse, Worse, Worse, Worse, Same, Worse, Worse, Wor…
$ idio2       <fct> Worse, Worse, Worse, Worse, Worse, Worse, Worse, Worse, Wo…
$ mesfut1     <fct> Not at all, Somewhat, A little, A little, Somewhat, Not at…
$ cp8         <fct> Never, Never, Never, Never, Never, Never, Never, Never, Ne…
$ it1         <fct> Somewhat trustworthy, Very trustworthy, Not very trustwort…
$ jc10        <fct> A military take-over of the state would be justified, A mi…
$ jc13        <fct> NA, NA, NA, A military take-over of the state would be jus…
$ jc15a       <fct> "No, it is not justified", NA, NA, "No, it is not justifie…
$ jc16a       <fct> NA, "No, it is not justified", "No, it is not justified", …
$ vic1ext     <fct> No, No, No, No, No, No, No, Yes, No, No, No, No, No, No, N…
$ aoj11       <fct> Somewhat safe, Somewhat safe, Very unsafe, Somewhat unsafe…
$ aoj12       <fct> NA, A lot, A lot, A lot, Some, None, Little, None, Some, S…
$ pese1       <fct> Lower, Lower, Higher, Lower, Lower, Higher, Lower, Higher,…
$ pese2       <fct> About the same, Higher, Higher, Lower, Lower, Higher, Abou…
$ aoj17       <fct> None, None, A lot, None, Somewhat, None, None, Little, Lit…
$ ivol24      <fct> No, NA, No, No, No, No, No, Yes, NA, NA, No, Yes, No, No, …
$ gang21      <fct> NA, NA, One person, NA, No, NA, NA, NA, One person, NA, NA…
$ gang22      <fct> NA, NA, Four or more persons, NA, No, NA, NA, One person, …
$ gang23      <fct> NA, NA, No, NA, Yes, NA, NA, No, NA, NA, NA, NA, No, NA, N…
$ gang24      <fct> NA, NA, Yes, NA, No, NA, NA, NA, Yes, NA, NA, NA, No, NA, …
$ aoj30       <fct> No, Yes, Yes, Yes, Yes, No, No, Yes, Yes, Yes, Yes, Yes, Y…
$ diso8       <fct> Not a problem, NA, Very serious, Not a problem, Not a prob…
$ drugt1      <fct> Very serious, Very serious, Very serious, Very serious, Ve…
$ countfair1  <fct> NA, Always, Always, Never, Sometimes, Always, Sometimes, N…
$ countfair3  <fct> Never, Never, Always, Sometimes, Sometimes, Always, Someti…
$ chm1bn      <fct> NA, NA, "To be able to vote to elect the authorities, even…
$ chm2bn      <fct> NA, NA, NA, "A system that guarantees access to a basic in…
$ b0          <fct> Not at all, 4, A lot, Not at all, 4, 4, 6, A lot, 3, A lot…
$ b1          <fct> A lot, 5, 4, 4, 5, A lot, 4, 3, 4, 5, 4, 4, 5, 6, 5, 4, 3,…
$ b2          <fct> 6, 4, 2, 4, 4, 2, 4, 5, 4, 6, 3, 3, 4, Not at all, 6, 5, N…
$ b3          <fct> 2, 3, Not at all, 4, 5, 2, 2, 2, 5, 3, 4, 2, 3, Not at all…
$ b4          <fct> 5, 5, Not at all, 4, 5, A lot, 3, Not at all, 5, 2, Not at…
$ b6          <fct> 4, 3, A lot, 4, A lot, A lot, A lot, 2, 5, 4, Not at all, …
$ b12         <fct> 3, 6, Not at all, 5, 4, Not at all, 5, Not at all, 5, Not …
$ b13         <fct> 2, 6, Not at all, Not at all, 5, Not at all, 4, 3, 3, 3, 4…
$ b18         <fct> Not at all, 5, Not at all, 5, 2, Not at all, 3, 2, 3, Not …
$ b21         <fct> A lot, 3, Not at all, Not at all, 4, Not at all, 4, Not at…
$ b21a        <fct> Not at all, 5, Not at all, Not at all, 4, Not at all, Not …
$ b21f        <fct> 2, 5, Not at all, Not at all, 4, Not at all, 5, 2, 2, 2, N…
$ b31         <fct> 3, 6, Not at all, 5, 3, Not at all, 4, 2, 2, Not at all, 5…
$ b37         <fct> 5, 6, Not at all, 5, 5, Not at all, 5, 3, 3, 5, 4, 4, 4, 4…
$ b47a        <fct> Not at all, 5, Not at all, 5, 3, Not at all, 5, 3, 5, 4, 6…
$ m1          <fct> Bad, Neither good nor bad (fair), Very bad, Very bad, Neit…
$ pop101      <fct> 2, 4, Strongly disagree, 4, 5, Strongly disagree, 2, Stron…
$ pop107      <fct> Strongly disagree, 3, Strongly agree, Strongly agree, 6, S…
$ ros4        <fct> Strongly agree, Strongly agree, 5, Strongly agree, 6, Stro…
$ ing4        <fct> Strongly disagree, Strongly agree, 2, 4, 4, Strongly disag…
$ eff1        <fct> 3, 5, Strongly disagree, Strongly disagree, 3, Strongly di…
$ eff2        <fct> 2, 5, 6, 4, 5, Strongly disagree, 4, Strongly disagree, 5,…
$ dra1        <fct> 5, 6, Strongly agree, 4, 5, Strongly disagree, Strongly di…
$ dra2        <fct> Strongly agree, 6, Strongly agree, 4, 5, Strongly disagree…
$ vb21n       <fct> "It is not possible to have an influence to change things"…
$ crg1        <fct> "No, it is not justifiable", "No, it is not justifiable", …
$ crg2        <fct> "No, it is not justifiable", "No, it is not justifiable", …
$ env2b       <fct> Very serious, Very serious, Very serious, Very serious, So…
$ anestg      <fct> Somewhat, A lot, Not at all, Not at all, Somewhat, Not at …
$ pn4         <fct> Dissatisfied, Satisfied, Dissatisfied, Dissatisfied, Dissa…
$ dem30       <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, No, Yes, Yes, Yes, …
$ e5          <fct> Strongly disapprove, 8, Strongly approve, Strongly disappr…
$ e20sur      <fct> 2, Strongly disapprove, 8, Strongly approve, 7, Strongly d…
$ e17a        <fct> 3, NA, Strongly approve, Strongly approve, 9, NA, 9, Stron…
$ e17b        <fct> NA, 8, NA, NA, NA, Strongly disapprove, NA, NA, NA, 6, Str…
$ d3          <fct> 5, 4, 6, Strongly disapprove, 6, Strongly disapprove, 3, S…
$ d4          <fct> 7, 7, Strongly disapprove, 4, 5, Strongly disapprove, 2, S…
$ d6          <fct> Strongly disapprove, 5, Strongly disapprove, Strongly disa…
$ d5newa      <fct> NA, 6, Strongly disapprove, NA, NA, Strongly disapprove, N…
$ d5newb      <fct> 7, NA, NA, 5, 5, NA, Strongly disapprove, NA, NA, NA, NA, …
$ exc2        <fct> No, No, No, No, No, No, No, No, Yes, No, No, No, No, No, N…
$ exc6        <fct> No, No, No, No, No, No, No, No, Yes, No, No, No, No, No, N…
$ exc7new     <fct> NA, Less than half of them, All, Less than half of them, M…
$ aoj2cora    <fct> No, Yes, No, No, No, No, Yes, No, No, Yes, No, Yes, No, NA…
$ aoj2corb    <fct> NA, Not so serious, NA, NA, NA, NA, Very serious, NA, NA, …
$ lib2c       <fct> Enough, Very little, Very little, Very little, Very little…
$ vb2         <fct> Voted, Did not vote, Did not vote, Voted, Voted, Voted, Di…
$ vb3n        <fct> NDP - National Democratic Party, NA, NA, VHP - Progressive…
$ vb10        <fct> No, No, No, No, Yes, No, No, No, No, No, Yes, No, No, No, …
$ vb11        <fct> NA, NA, NA, NA, National Democratic Party - NDP, NA, NA, N…
$ pol1        <fct> None, Some, None, A lot, Little, None, Some, None, Some, S…
$ vb20        <fct> Would vote for a candidate or party different from the cur…
$ vb50        <fct> Strongly disagree, Disagree, Strongly disagree, Disagree, …
$ vb51        <fct> Both the same, NA, NA, Both the same, Both the same, Both …
$ vb52        <fct> NA, A woman, It does not matter, NA, NA, NA, NA, It does n…
$ vb58        <fct> NA, NA, NA, Disagree, Disagree, Disagree, Agree, NA, Disag…
$ vb58exp     <fct> Disagree, Disagree, Disagree, NA, NA, NA, NA, Strongly agr…
$ dvw1        <fct> Would not approve but understand, Would not approve nor un…
$ dvw2        <fct> Would not approve nor understand, Would not approve nor un…
$ for5n       <fct> India, United States, United States, United States, United…
$ mil10a      <fct> Somewhat trustworthy, Not very trustworthy, Somewhat trust…
$ mil10b      <fct> Not at all trustworthy, Not very trustworthy, NA, NA, Not …
$ mil10e      <fct> Somewhat trustworthy, Somewhat trustworthy, Somewhat trust…
$ dis11       <fct> Never, Never, Many times, Never, Never, Never, Never, Some…
$ dis12       <fct> Never, Never, Many times, Never, Never, Never, Never, Some…
$ mgm1        <fct> Very harmful, Very harmful, Very harmful, Very harmful, So…
$ mgm2        <fct> There Is no gold mining where you live, There Is no gold m…
$ mgm3        <fct> NA, Too much, Very little, Too much, Very little, NA, Very…
$ wf1         <fct> No, No, No, No, No, No, No, No, No, No, No, No, No, No, No…
$ edre        <fct> "Primary School incomplete", "Secondary (Middle through Hi…
$ q3cn        <fct> "Hindu", NA, "None (Believes in a Supreme Entity but does …
$ q5b         <fct> Very important, Very important, Very important, Very impor…
$ ocup4a      <fct> "Not working and not looking for a job?", "Taking care of …
$ formal      <fct> NA, NA, No, NA, Yes, Yes, NA, Yes, Yes, Yes, No, Yes, Yes,…
$ q10inc      <fct> "Between SRD 1,501 - SRD 2,500", "Between SRD 5,201 - SRD …
$ q10e        <fct> Decreased?, Remained the same?, Remained the same?, Remain…
$ q14         <fct> No, No, Yes, Yes, Yes, No, No, No, No, No, Yes, Yes, Yes, …
$ fs2         <fct> No, No, No, Yes, No, No, No, Yes, Yes, Yes, No, No, Yes, N…
$ fs212       <fct> No, No, No, NA, Yes, No, No, NA, NA, NA, No, No, NA, No, N…
$ fs8         <fct> No, No, Yes, No, No, Yes, No, No, Yes, No, No, No, Yes, No…
$ ws1         <fct> No, No, No, No, No, No, No, No, No, Yes, No, No, No, No, N…
$ ws2         <fct> NA, NA, NA, NA, NA, NA, NA, NA, NA, Always, NA, NA, NA, NA…
$ q11n        <fct> Married, Married, Single, Common law marriage (Living toge…
$ q12cn       <fct> 3, 2, 9, 6, 5, 6, 2, 2, 4, 2, 4, 6, 4, 4, 8, 6, 4, 1, 5, 7…
$ q12bn       <fct> None, None, 2, 2, 1, None, None, None, None, None, None, 2…
$ q12bnf      <fct> NA, NA, "No", "Yes, parent", "Yes, parent", NA, NA, NA, NA…
$ q12p        <fct> NA, 22, NA, 20, NA, 23, 29, NA, NA, 17, NA, 19, NA, 26, NA…
$ etid        <fct> Hindustani ("East Indians"), Mixed, Afro-Surinamese/Creole…
$ leng1       <fct> Sarnami, Dutch, Dutch, Sarnami, Dutch, Sarnami, Dutch, Dut…
$ gi0n        <fct> A few times a month, Daily, Daily, Daily, A few times a we…
$ smedia1n    <fct> No, Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Y…
$ smedia3n    <fct> NA, A few times a week, Daily, NA, Daily, NA, Daily, Daily…
$ smedia3b    <fct> NA, NA, Several times a day, NA, Several times a day, NA, …
$ smedia11    <fct> No, No, No, Yes, Yes, No, Yes, Yes, Yes, No, Yes, Yes, No,…
$ smedia12    <fct> No, No, No, Yes, No, No, No, Yes, No, No, No, No, No, No, …
$ smedia13    <fct> Not at all, Very much, Somewhat, Not at all, Somewhat, Not…
$ smedia14n   <fct> Not at all, Somewhat, Very much, Somewhat, Somewhat, Not a…
$ smedia15    <fct> Not at all, A Little, Very much, Not at all, A Little, Not…
$ smedia16    <fct> Phone call, Text message, In person, Phone call, Text mess…
$ r3          <fct> No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r4a         <fct> Yes, Yes, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r6          <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes…
$ r7          <fct> No, Yes, No, Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, N…
$ r12         <fct> Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes…
$ r15         <fct> No, Yes, Yes, No, Yes, No, No, Yes, No, No, Yes, Yes, Yes,…
$ r18n        <fct> No, Yes, Yes, No, Yes, No, No, Yes, Yes, No, Yes, Yes, Yes…
$ r18         <fct> No, Yes, Yes, No, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Y…
$ r16         <fct> Yes, Yes, Yes, No, Yes, Yes, Yes, Yes, Yes, Yes, Yes, Yes,…
$ r27         <fct> No, Yes, Yes, No, Yes, No, No, No, No, No, Yes, No, Yes, N…
$ colorr      <fct> 5, 3, 8, 4, 9, 5, 9, 3, 9, 7, 3, 6, 3, 5, 4, 4, 5, 3, 8, 5…
$ surlangq    <fct> The full interview was conducted in Sranan Tongo, The full…
$ noise1      <fct> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No"…
$ conocim     <fct> Very low, High, Neither high or low, Low, Neither high or …
$ sexin       <fct> Female/Woman, Female/Woman, Female/Woman, Female/Woman, Fe…
$ colori      <fct> 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9…
$ formatq     <fct> STG, STG, STG, STG, STG, STG, STG, STG, STG, STG, STG, STG…
$ idiomaq     <fct> See SURLANGQ, See SURLANGQ, See SURLANGQ, See SURLANGQ, Se…
$ fecha       <chr> "20apr2023", "28mar2023", "01apr2023", "12apr2023", "30mar…

Pipes in detail

  • %>% is the pipe operator from the magrittr package
  • |> is the base R pipe operator introduced in R 4.1.0

Both operators allow you to chain together multiple operations in a more readable way.

A pipe is a way to pass the output of one function as the input to another function, without having to create intermediate variables.

A %>% B() is equivalent to B(A), where A is the result of the left-hand side and B is the function on the right-hand side. By default, A is passed as the first argument of function B.
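
The equivalence can be checked with a small sketch. The vector x is made up for illustration and is not part of the course data.

```r
library(magrittr)

x <- c(1, 4, 9)

nested <- round(mean(sqrt(x)), 1)               # nested calls, read inside-out
piped  <- x %>% sqrt() %>% mean() %>% round(1)  # magrittr pipe, read left-to-right
native <- x |> sqrt() |> mean() |> round(1)     # base R pipe (R >= 4.1.0)

identical(nested, piped)    # TRUE
identical(nested, native)   # TRUE
```

All three versions compute the same value; the piped versions list the steps in the order in which they are carried out.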

Warning

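The next step in a pipe is always expected to be a function! If the next step is not a function call (for example, an arithmetic expression), you need a workaround such as wrapping the expression in a function; otherwise you will typically get an error.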
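
Two common workarounds, sketched with a made-up vector: magrittr evaluates an expression wrapped in braces with the input available as ., and the native pipe can call an anonymous function wrapped in parentheses.

```r
library(magrittr)

x <- c(1, 4, 9)

# magrittr: wrap the expression in braces and refer to the input as `.`
a <- x %>% { . * 2 + 1 }

# native pipe: wrap an anonymous function in parentheses and call it
b <- x |> (\(v) v * 2 + 1)()

identical(a, b)   # TRUE
```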

What if the next step is not the first argument?

You can use the placeholder . (or _ with the native R pipe) to indicate where the output of the previous step should go in the next function.

For example,

stata_data %>% 
  filter(q13_05 == 1 | q13_05 == 2) %>%
  t.test(Year_s ~ q13_05, data = .)

    Welch Two Sample t-test

data:  Year_s by q13_05
t = -1, df = 1429, p-value = 0.3175
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 -0.0020710668  0.0006724654
sample estimates:
mean in group 1 mean in group 2 
       2022.000        2022.001 

What if the next step is not the first argument? (native pipe)

With the native R pipe the placeholder is _ instead of ., and it must be supplied to a named argument (here data = _). This requires R 4.2.0 or later.

For example,

stata_data %>% 
  filter(q13_05 == 1 | q13_05 == 2) |>
  t.test(Year_s ~ q13_05, data = _)

    Welch Two Sample t-test

data:  Year_s by q13_05
t = -1, df = 1429, p-value = 0.3175
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 -0.0020710668  0.0006724654
sample estimates:
mean in group 1 mean in group 2 
       2022.000        2022.001 

Other pipes

  • %$% is the exposition pipe from the magrittr package, which allows you to use the names of the variables in the data frame directly without having to use the $ operator.
  • %<>% is the assignment pipe from the magrittr package, which allows you to modify the data frame in place, without the need for calling assign() or <- again.

There are more pipes (like the %T>% pipe), but they can be very confusing and we therefore skip them in this course.

stata_data %>% 
  filter(q13_05 == 1 | q13_05 == 2) %$% # Note the exposition pipe
  t.test(Year_s ~ q13_05)

    Welch Two Sample t-test

data:  Year_s by q13_05
t = -1, df = 1429, p-value = 0.3175
alternative hypothesis: true difference in means between group 1 and group 2 is not equal to 0
95 percent confidence interval:
 -0.0020710668  0.0006724654
sample estimates:
mean in group 1 mean in group 2 
       2022.000        2022.001 

Renaming variables with a pipe

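# Regular pipe with explicit assignment
spss_data <- spss_data %>%
  rename(gender = q1tc_r)

# Equivalent with the assignment pipe (run one of the two, not both)
# spss_data %<>%
#   rename(gender = q1tc_r)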

spss_data$gender %>% head()
[1] <NA>         Woman/female Man/male     Woman/female Man/male    
[6] Woman/female
6 Levels: Man/male Woman/female ... N/A

Recoding with mutate() and recode()

spss_data <- spss_data %>%
  mutate(gender_rec = recode(gender, 
                             "Man/male" = "male", 
                             "Woman/female" = "female"))

spss_data$gender %>% head()
[1] <NA>         Woman/female Man/male     Woman/female Man/male    
[6] Woman/female
6 Levels: Man/male Woman/female ... N/A
spss_data$gender_rec %>% head()
[1] <NA>   female male   female male   female
Levels: male female Does not identify as either man or woman DK NR N/A

Labeling variables

With labeled variables, we can add an additional layer of description to a variable, very similar to what SPSS and Stata do.

spss_data <- set_variable_labels(spss_data, gender_rec = "Gerecodeerd geslacht")
spss_data$gender_rec %>% glimpse()
 Factor w/ 6 levels "male","female",..: NA 2 1 2 1 2 2 1 1 2 ...
 - attr(*, "label")= chr "Gerecodeerd geslacht"
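
Labels can be read back with var_label() from the labelled package. A minimal sketch on a hypothetical toy data frame (not the survey file used in these slides):

```r
library(labelled)

# Hypothetical toy data, only for illustration
toy <- data.frame(gender_rec = factor(c("male", "female", "female")))

toy <- set_variable_labels(toy, gender_rec = "Recoded gender")

var_label(toy$gender_rec)   # the label of a single variable
var_label(toy)              # a named list with the labels of all variables
```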

Selecting and filtering

spss_data %<>%
  rename(age = q2)

spss_data %>%
  filter(age > 18) %>%
  select(age, gender) %>%
  summary()
      age                                         gender 
 18     :0   Man/male                                :0  
 19     :0   Woman/female                            :0  
 20     :0   Does not identify as either man or woman:0  
 21     :0   DK                                      :0  
 22     :0   NR                                      :0  
 23     :0   N/A                                     :0  
 (Other):0                                               

Calculations and summarising

spss_data %>%
  filter(age > 18) %>%
  summarise(mean_age = mean(age, na.rm = TRUE))
# A tibble: 1 × 1
  mean_age
     <dbl>
1       NA

age is also a factor, so the comparison age > 18 yields NA for every row (filter() then keeps no rows) and mean() is not meaningful. We have to convert age to numeric:

spss_data %>%
  mutate(age = as.numeric(age)) %>% 
  filter(age > 18) %>%
  summarise(mean_age = mean(age, na.rm = TRUE))
# A tibble: 1 × 1
  mean_age
     <dbl>
1     35.6
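
A caveat worth checking for the conversion above: as.numeric() applied to a factor returns the internal level codes (1, 2, 3, ...), not the values shown as labels. The sketch below uses a made-up age vector, not the survey data, to show the difference; whether it affects the result above depends on how age is stored in spss_data.

```r
# Made-up ages stored as a factor, only for illustration
age <- factor(c("18", "25", "40", "25"))

as.numeric(age)                 # level codes: 1 2 3 2
as.numeric(as.character(age))   # printed values: 18 25 40 25
as.numeric(levels(age))[age]    # same result, via the levels
```

Converting through as.character() first is the safer route when the factor labels are the numbers of interest.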

Modeling in R

To model objects based on other objects, we use ~ (tilde)

For example, to model body mass index (BMI) on weight, we would type

BMI ~ weight

The ~ is used to separate the left- and right-hand sides in a model formula.

To use arithmetic (a function of other variables) inside a formula, we wrap it in I(). For example, to model body mass index (BMI) on its deterministic function of weight and height, we would type

BMI ~ I(weight / height^2)
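
A formula is an ordinary R object that can be stored and inspected before it is passed to a modeling function. The sketch below uses hypothetical toy data (weight in kg, height in m). Without I(), the operators / and ^ would be read as formula operators rather than as arithmetic.

```r
# Hypothetical toy data, only for illustration
toy <- data.frame(weight = c(60, 72, 85, 95),
                  height = c(1.65, 1.80, 1.75, 1.90))
toy$BMI <- toy$weight / toy$height^2

f <- BMI ~ I(weight / height^2)
class(f)      # "formula"
all.vars(f)   # "BMI" "weight" "height"

# Because BMI is an exact function of the predictor, the fit is perfect:
# the intercept is (numerically) 0 and the slope is 1
fit <- lm(f, data = toy)
coef(fit)
```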

Modeling continued

We already saw the use of the ~ operator in the t.test() function, where we specified the outcome variable on the left-hand side and the grouping variable on the right-hand side.

stata_data %>% 
  filter(q13_05 == 1 | q13_05 == 2) %$%
  t.test(Year_s ~ q13_05)

Using formulas

# Use a formula in function lm() in a pipe
mtcars %>%
  lm(mpg ~ wt + hp, data = .) 

Call:
lm(formula = mpg ~ wt + hp, data = .)

Coefficients:
(Intercept)           wt           hp  
   37.22727     -3.87783     -0.03177  

Using formulas with broom

With the broom package, we can easily tidy up the output of models and other functions that return complex objects.

mtcars %>%
  lm(mpg ~ wt + hp, data = .) %>% 
  tidy()
# A tibble: 3 × 5
  term        estimate std.error statistic  p.value
  <chr>          <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)  37.2      1.60        23.3  2.57e-20
2 wt           -3.88     0.633       -6.13 1.12e- 6
3 hp           -0.0318   0.00903     -3.52 1.45e- 3
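
Besides tidy(), broom offers glance() for model-level summaries and augment() for observation-level results. A short sketch on the same mtcars model:

```r
library(broom)

fit <- lm(mpg ~ wt + hp, data = mtcars)

glance(fit)    # one row with model-level statistics such as r.squared and AIC
augment(fit)   # one row per observation, with columns such as .fitted and .resid
```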

Tables and Contingency Tables

stata_data %$%
  table(district)
district
Brokopondo Commewijne    Coronie  Marowijne   Nickerie       Para Paramaribo 
         7        170         33         75        213         27        945 
 Saramacca Sipaliwini     Wanica 
       233        118        681 
stata_data %$%
  table(district, stratum)
            stratum
district       1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
  Brokopondo   0   0   0   0   0   0   0   0   4   0   0   0   0   0   0   3
  Commewijne   0   0   0   0   0  46   0   0   0 124   0   0   0   0   0   0
  Coronie      0   0   0  33   0   0   0   0   0   0   0   0   0   0   0   0
  Marowijne    0   0   0   0   0  47   0   0   0   0   0   0   0   0   0  28
  Nickerie     0   0   0 213   0   0   0   0   0   0   0   0   0   0   0   0
  Para         0   0   0   0   0   0   0   0  22   0   0   0   0   0   0   5
  Paramaribo 150 147 112   0 165   0   0   0   0   0 120 129 122   0   0   0
  Saramacca    0   0   0   0   0   0   0 182  51   0   0   0   0   0   0   0
  Sipaliwini   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 118
  Wanica       0   0   0   0   0   0 233   0  57   0   0   0   0 164 227   0

Tables

stata_data %$%
  table(district, stratum) %>% # calculate table
  prop.table() %>% # convert to proportions
  round(3) # round to 3 decimals
            stratum
district         1     2     3     4     5     6     7     8     9    10    11
  Brokopondo 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.000 0.000
  Commewijne 0.000 0.000 0.000 0.000 0.000 0.018 0.000 0.000 0.000 0.050 0.000
  Coronie    0.000 0.000 0.000 0.013 0.000 0.000 0.000 0.000 0.000 0.000 0.000
  Marowijne  0.000 0.000 0.000 0.000 0.000 0.019 0.000 0.000 0.000 0.000 0.000
  Nickerie   0.000 0.000 0.000 0.085 0.000 0.000 0.000 0.000 0.000 0.000 0.000
  Para       0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.009 0.000 0.000
  Paramaribo 0.060 0.059 0.045 0.000 0.066 0.000 0.000 0.000 0.000 0.000 0.048
  Saramacca  0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.073 0.020 0.000 0.000
  Sipaliwini 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
  Wanica     0.000 0.000 0.000 0.000 0.000 0.000 0.093 0.000 0.023 0.000 0.000
            stratum
district        12    13    14    15    16
  Brokopondo 0.000 0.000 0.000 0.000 0.001
  Commewijne 0.000 0.000 0.000 0.000 0.000
  Coronie    0.000 0.000 0.000 0.000 0.000
  Marowijne  0.000 0.000 0.000 0.000 0.011
  Nickerie   0.000 0.000 0.000 0.000 0.000
  Para       0.000 0.000 0.000 0.000 0.002
  Paramaribo 0.052 0.049 0.000 0.000 0.000
  Saramacca  0.000 0.000 0.000 0.000 0.000
  Sipaliwini 0.000 0.000 0.000 0.000 0.047
  Wanica     0.000 0.000 0.066 0.091 0.000
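
By default prop.table() divides by the grand total, as in the table above. The margin argument gives row or column proportions instead. A small sketch using mtcars rather than the survey data:

```r
tab <- table(cyl = mtcars$cyl, gear = mtcars$gear)

prop.table(tab)               # all cells together sum to 1
prop.table(tab, margin = 1)   # each row sums to 1
prop.table(tab, margin = 2)   # each column sums to 1
```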

Tibbles vs data.frames

Tibbles are a modern re-imagining of data frames in R. They are part of the tidyverse and provide a more user-friendly interface for working with data.

is_tibble(band_members)
[1] TRUE
band_members 
# A tibble: 3 × 2
  name  band   
  <chr> <chr>  
1 Mick  Stones 
2 John  Beatles
3 Paul  Beatles
band_members %>% 
  as.data.frame()
  name    band
1 Mick  Stones
2 John Beatles
3 Paul Beatles
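
Two practical differences, sketched on a hand-made copy of the same three rows: a tibble never simplifies to a vector when subsetting with [ ], and $ does not do partial name matching.

```r
library(tibble)

df <- data.frame(name = c("Mick", "John", "Paul"),
                 band = c("Stones", "Beatles", "Beatles"))
tb <- as_tibble(df)

class(df[, "name"])   # "character": the data frame drops to a vector
class(tb[, "name"])   # "tbl_df" "tbl" "data.frame": the tibble stays a tibble

df$na                 # partial matching returns the name column
tb$na                 # NULL, with a warning about an unknown column
```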

Layers in R

There are several ‘layers’ in R. Some layers you are allowed to fiddle around in, some are forbidden. In general there is the following distinction:

  • The global environment
  • User environments
  • Functions
  • Packages
  • Namespaces

Environments

The global environment can be seen as an Olympic-size swimming pool. Everything you do has its place there.

If you’d like, you may create another, separate environment to work in.

  • Objects in a user environment are by default not visible from the global environment; you have to refer to that environment explicitly to reach them
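
A minimal sketch of a separate environment. The names pool_side and towel are made up for illustration.

```r
pool_side <- new.env()                       # a user environment
assign("towel", "blue", envir = pool_side)   # create an object inside it

# The object is not part of the global environment itself ...
exists("towel", envir = globalenv(), inherits = FALSE)   # FALSE

# ... but can be reached by naming the environment explicitly
get("towel", envir = pool_side)   # "blue"
ls(pool_side)                     # "towel"
```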

Functions

  • If you create a function, it is positioned in the global environment.

  • Everything that happens in a function stays in the function, unless you specifically tell the function to share the information with the global environment.

  • Think of a function as a shampoo bottle in a swimming pool to which you add some water. If you’d like to see the color of the mixture, you have to squeeze the bottle for it to come out.
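
The shampoo bottle as a sketch, with a made-up function add_water(): objects created inside the function disappear when it finishes, and returning a value is the usual way to get the result out.

```r
add_water <- function(shampoo) {
  mixture <- paste(shampoo, "+ water")   # exists only while the function runs
  mixture                                # 'squeeze the bottle': return the value
}

result <- add_water("green shampoo")
result                                   # "green shampoo + water"

# The local object never reached the global environment
exists("mixture", envir = globalenv(), inherits = FALSE)   # FALSE
```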

Packages and Namespaces

Namespaces are a way to organize functions and data in R. Every package has its own namespace, which means that functions and data in one package do not interfere with functions and data in another package.

  • Everything needed to run the functions in a package is neatly contained within its own space
  • Think of packages as separate (mini) pools that are connected to the main pool (the global environment)
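
Namespaces matter in practice when two packages export a function with the same name. filter() is one such case: it exists in both stats and dplyr. The :: operator selects the namespace explicitly, sketched here with mtcars:

```r
library(dplyr)   # attaching dplyr masks stats::filter()

four_cyl <- dplyr::filter(mtcars, cyl == 4)     # row filter from dplyr
nrow(four_cyl)                                  # 11

smoothed <- stats::filter(1:5, rep(1 / 3, 3))   # moving average from stats
as.numeric(smoothed)                            # NA 2 3 4 NA
```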

%in%

The %in% operator is used to check if elements of one vector are present in another vector. It returns a logical vector indicating whether each element of the first vector is found in the second vector.

x <- c(1, 2, 3, 4, 5)
y <- c(3, 4, 5, 6, 7)
x %in% y
[1] FALSE FALSE  TRUE  TRUE  TRUE
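
%in% is a compact alternative to chaining several == comparisons with |, as was done in the filter() calls earlier. A sketch using mtcars instead of the survey data:

```r
cyl_or <- mtcars[mtcars$cyl == 4 | mtcars$cyl == 6, ]
cyl_in <- mtcars[mtcars$cyl %in% c(4, 6), ]

identical(cyl_or, cyl_in)   # TRUE
nrow(cyl_in)                # 18

# The two differ in how they treat missing values
c(1, NA) == 1      # TRUE NA
c(1, NA) %in% 1    # TRUE FALSE
```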

grepl()

The grepl() function is used to search for a pattern in a character vector. It returns a logical vector indicating whether the pattern is found in each element of the character vector.

x <- c("apple", "banana", "cherry", "date")
pattern <- "a"
grepl(pattern, x)
[1]  TRUE  TRUE FALSE  TRUE
grepl("^a", x) # starts with a
[1]  TRUE FALSE FALSE FALSE
grepl("e$", x) # ends with e
[1]  TRUE FALSE FALSE  TRUE
grepl("cherry", x)
[1] FALSE FALSE  TRUE FALSE
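
A typical use is selecting columns by name pattern. The sketch below picks the x columns from the built-in anscombe data and shows the ignore.case argument.

```r
x_cols <- names(anscombe)[grepl("^x", names(anscombe))]
x_cols                       # "x1" "x2" "x3" "x4"

head(anscombe[, x_cols], 3)  # first rows of the selected columns

# Matching is case-sensitive unless ignore.case = TRUE
grepl("A", c("apple", "Apple"))                       # FALSE  TRUE
grepl("A", c("apple", "Apple"), ignore.case = TRUE)   # TRUE  TRUE
```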

Anscombe data

anscombe |> 
  as_tibble()
# A tibble: 11 × 8
      x1    x2    x3    x4    y1    y2    y3    y4
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1    10    10    10     8  8.04  9.14  7.46  6.58
 2     8     8     8     8  6.95  8.14  6.77  5.76
 3    13    13    13     8  7.58  8.74 12.7   7.71
 4     9     9     9     8  8.81  8.77  7.11  8.84
 5    11    11    11     8  8.33  9.26  7.81  8.47
 6    14    14    14     8  9.96  8.1   8.84  7.04
 7     6     6     6     8  7.24  6.13  6.08  5.25
 8     4     4     4    19  4.26  3.1   5.39 12.5 
 9    12    12    12     8 10.8   9.13  8.15  5.56
10     7     7     7     8  4.82  7.26  6.42  7.91
11     5     5     5     8  5.68  4.74  5.73  6.89

Anscombe data properties

anscombe |>
  cor() |>
  round(2)
      x1    x2    x3    x4    y1    y2    y3    y4
x1  1.00  1.00  1.00 -0.50  0.82  0.82  0.82 -0.31
x2  1.00  1.00  1.00 -0.50  0.82  0.82  0.82 -0.31
x3  1.00  1.00  1.00 -0.50  0.82  0.82  0.82 -0.31
x4 -0.50 -0.50 -0.50  1.00 -0.53 -0.72 -0.34  0.82
y1  0.82  0.82  0.82 -0.53  1.00  0.75  0.47 -0.49
y2  0.82  0.82  0.82 -0.72  0.75  1.00  0.59 -0.48
y3  0.82  0.82  0.82 -0.34  0.47  0.59  1.00 -0.16
y4 -0.31 -0.31 -0.31  0.82 -0.49 -0.48 -0.16  1.00

Adding some random numbers

anscombe_new <- 
  anscombe |> 
  mutate(x1 = x1 + rnorm(nrow(anscombe), mean = 0, sd = 0.1),
         x2 = x2 + rnorm(nrow(anscombe), mean = 0, sd = 0.1),
         x3 = x3 + rnorm(nrow(anscombe), mean = 0, sd = 0.1),
         x4 = x4 + rnorm(nrow(anscombe), mean = 0, sd = 0.1)) |> 
  as_tibble()
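
Because rnorm() draws new numbers on every run, the result changes each time unless a seed is set. The sketch below is one possible reproducible variant: add_noise() is a hypothetical helper, the seed 123 is arbitrary, and across() applies the same step to all x columns. Its values will not match the output shown in these slides.

```r
library(dplyr)

# Hypothetical helper: add small normal noise to every column starting with "x"
add_noise <- function(data, seed = 123) {
  set.seed(seed)   # fix the random number stream
  data |>
    mutate(across(starts_with("x"),
                  ~ .x + rnorm(length(.x), mean = 0, sd = 0.1))) |>
    as_tibble()
}

first  <- add_noise(anscombe)
second <- add_noise(anscombe)
identical(first, second)   # TRUE: same seed, same noise
```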

anscombe_new data properties

anscombe_new |>
  cor() |>
  round(2)
      x1    x2    x3    x4    y1    y2    y3    y4
x1  1.00  1.00  1.00 -0.51  0.82  0.82  0.82 -0.32
x2  1.00  1.00  1.00 -0.50  0.81  0.81  0.81 -0.31
x3  1.00  1.00  1.00 -0.50  0.81  0.82  0.81 -0.31
x4 -0.51 -0.50 -0.50  1.00 -0.53 -0.72 -0.35  0.81
y1  0.82  0.81  0.81 -0.53  1.00  0.75  0.47 -0.49
y2  0.82  0.81  0.82 -0.72  0.75  1.00  0.59 -0.48
y3  0.82  0.81  0.81 -0.35  0.47  0.59  1.00 -0.16
y4 -0.32 -0.31 -0.31  0.81 -0.49 -0.48 -0.16  1.00

Practical